• Object Detection: From the TensorFlow API to YOLOv2 on iOS

    Late in May, I decided to learn more about CNNs by participating in a Kaggle competition called Sea Lion Population Count. First, I came across a nice notebook by Radu Stoicescu, Use keras to classify Sea Lions: 0.91 accuracy, which includes the statement: “This is the state of the art (object detection) method at the moment: https://cs.stanford.edu/people/karpathy/rcnn/”. So I checked out Andrej Karpathy’s Playing around with RCNN, State of the Art Object Detector and the original RCNN paper, not realizing that “state of the art” meant as of 2014. Days later, I rewatched the Stanford CS231n lecture 8 video, Spatial Localization and Detection, by Justin Johnson (somehow the first viewing months ago left no impression on me; maybe I just fell asleep). It’s a great video, and it covers the better (more state-of-the-art, as of Feb 2016) object detection models that came after RCNN: Fast RCNN, Faster RCNN, and YOLO. So I spent a few more days reading the papers and looking at some GitHub repos implementing the models.

  • How to Develop a Prisma-like iOS App with Offline Art Filters

    [Update June 13, 2017: Based on user feedback, I’ve updated the code and this blog after testing steps 2-6 on both TensorFlow 0.12 (again) and TensorFlow 1.1, and running the iOS app in step 7 with the TensorFlow 1.0 source.]

  • TensorFlow Deep Learning Machine ezDIY

    If you’re thinking of buying or building your own deep learning machine with a nice GPU for training, you may have come across Andrej Karpathy’s tweet about his deep learning rig build, which is a little outdated now (it was published in Sep. 2015), or, more recently, Lukas Biewald’s Build a super fast deep learning machine for under $1,000, published on Feb. 1, 2017. I happened to start training some CNN and RNN models on my own machine in mid-January, and I’m pretty happy with what I ended up with: a very cost-effective Dell PC (Intel i7-6700 3.4GHz CPU, 16GB memory, 2TB SATA hard drive, 4GB Nvidia GTX 745) for $849, plus an 8GB Nvidia GTX 1070 GPU for $388. The total is $1,237, $237 more than Lukas’s machine, but his build has a 3GB GTX 1060 (about $200 less than the 8GB GTX 1070, and the 1070 is about 50% faster than the 1060), 8GB of memory ($129 less), a 1TB hard drive ($50 less), and an Intel i5-6600 3.3GHz CPU. So for $237 more, you get about $400 more in value, plus you only need to swap in a new GPU instead of building the whole machine from scratch. The Dell machine also comes with Windows 10 Professional, which may not sound sexy or geeky, but its remote desktop feature can definitely make your life more enjoyable (I’ll talk more about this later).
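    The cost comparison above can be tallied in a few lines; the prices are the ones quoted in the post (the component deltas don’t include the small i7-6700 vs. i5-6600 CPU difference, which is what rounds the total up to roughly $400):

    ```python
    # Cost tally for the Dell PC + GTX 1070 build vs. Lukas's $1,000 build.
    dell_pc = 849    # i7-6700, 16GB RAM, 2TB HDD, 4GB GTX 745
    gtx_1070 = 388   # 8GB GTX 1070, swapped in for the GTX 745

    total = dell_pc + gtx_1070        # $1,237 for the whole machine
    extra_over_lukas = total - 1000   # $237 more than the $1,000 build

    # Extra value over the $1,000 build, per component:
    gpu_delta = 200   # 8GB GTX 1070 vs. 3GB GTX 1060
    ram_delta = 129   # 16GB vs. 8GB memory
    disk_delta = 50   # 2TB vs. 1TB hard drive
    value_gained = gpu_delta + ram_delta + disk_delta  # ~$379, plus a faster CPU

    print(total, extra_over_lukas, value_gained)
    ```
    
    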

  • Reinforcement Learning in Tic-Tac-Toe

    Different people learn in different ways. Some prefer a teacher, mentor, or supervisor guiding them through each step of the learning process; others are more like self-learners. Supervised learning is no doubt very important, but I find reinforcement learning (RL) fascinating because of its self-learning flavor: a program called TD-Gammon, developed in 1992, learned how to play backgammon via hundreds of thousands of self-play games and was then able to beat the best human players. The famous AlphaGo also used RL; see Jim Fleming’s Before AlphaGo there was TD-Gammon for more info.
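    The self-play idea behind TD-Gammon can be sketched with a temporal-difference value update, here applied to tic-tac-toe boards. This is only an illustrative sketch (the 9-character board encoding, the learning rate, and the single replayed game are my assumptions, not code from the post): after a finished game, each visited state’s value is nudged toward the value of the state that followed it, with the final state nudged toward the game’s reward.

    ```python
    # Minimal TD(0)-style value update over one finished self-play game.
    # Board encoding (9-char string) and ALPHA are illustrative assumptions.
    from collections import defaultdict

    ALPHA = 0.1                          # learning rate
    values = defaultdict(lambda: 0.5)    # V(s): estimated win probability for X

    def td_update(state, target):
        """Move V(state) a fraction ALPHA of the way toward target."""
        values[state] += ALPHA * (target - values[state])

    # One finished game, as the sequence of boards seen by X; X won -> reward 1.0
    game = ["         ",   # empty board
            "X        ",   # X plays corner
            "X   O    ",   # O replies center
            "XX  O    "]   # ...game continues; treat this as the last state

    # Replay the game backwards, propagating the reward toward earlier states.
    target = 1.0
    for s in reversed(game):
        td_update(s, target)
        target = values[s]
    ```

    Over many such self-play games, states that tend to lead to wins drift above 0.5 and states that lead to losses drift below it, which is what lets the program improve with no teacher beyond the game’s outcome.
    
    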

  • What Kind of Dog Is It - Using TensorFlow on Mobile Device

    Even before I got my first dog, a Labrador Retriever, in June 2015, whenever I was out walking and saw a dog, I often wondered what breed it was. About a year ago, I found the Stanford Dogs Dataset and asked a friend with a Ph.D. in computer vision from CMU whether it would be possible to use it with some machine learning algorithm on my iPhone and reach a recognition precision of about 80% or 90%. This is what he told me: “80-90% will be really hard, unless you are willing to restrict your problem in some way. For example, request a user to take a picture in some specific angle, or reduce the number of classes.” He also said that “for deep learning to work, you will need a lot more data (than the Stanford Dogs Dataset, which has about 100-200 images for each dog breed) to train the neural network”.