Great work guys! Are these going to only be computer vision nets? I think the hu...

kastnerkyle · on Sept 10, 2014

Support is planned for audio and hopefully text - I am working on building a million song dataset to recreate the work Sander Dieleman did for Spotify, and have had some possible support in October for the weights of a trained speech network! So yes, feature extraction from other domains should be on the horizon.

We are specifically trying to make it easy to say: I want to transform an image (or audio, etc.) using a pretrained net. Download the weights, extract the features for me, and give me the feature vectors so I can do something with those. This seems to be really, really, really hard in all the tools I have used and usually involves training yourself, which is not very useful for things on the level of ImageNet.

Of course, having nice examples and good docs is one of the great parts of scikit-learn, and is actually one of the things I have been working on most recently. Our docs aren't to that level yet, but I hope they can be one day.

DeCAF (precursor to Caffe) and OverFeat binaries were really kind of the first in this regard (about 1 year ago now), but IMO one of the limitations is that they lose interaction with the rest of the Python ecosystem for data munging, and simple algorithms for exploration. By wrapping the weights, we hope to leverage the support of the Python ML ecosystem easily, while still being able to use the power of these networks.

Right now the most compelling use case is as part of a scikit-learn pipeline i.e. make_pipeline(OverfeatTransformer, LinearSVC) or whatever. Feed images in, get predictions out. I am also working on a demo of "writing your own twitter bot" similar to https://twitter.com/id_birds , written by Daniel Nouri . I like sloths, so it will definitely be a slothbot.

I also hope to support recurrent architectures from Groundhog (https://github.com/lisa-groundhog/GroundHog) as several researchers here at the LISA lab have been using it to get pretty amazing results in NLP and audio, both of which are potential targets in the future. If we can leverage their work, it would be a very nice way for people to immediately play with SOTA architectures in different applications.

In any case, just loading in weights and extracting image features easily is nice, and was a benefit for both Michael and myself in a research project this summer.

agibsonccc · on Sept 10, 2014

Great to hear! That's exactly what I'm trying to replicate as well. I'm mainly trying to do it for industry myself. Not a lot of people like making this stuff for the JVM ecosystem (understandble of course...I love python as well)

I also agree about caffe as well. The python ecosystem is amazing and should be leveraged which also increases adoption.

As I said before, being able to do this at scale for people where their data is stored on the JVM should help it make it more accessibble to a lot of people.

Re: Twitter bot. This looks really cool.!

I'll be keeping an eye on developments here. Good stuff!