• Language Understanding and Learning and AI

    It’s been over 2 years since my last post on building mobile AI apps with TensorFlow. Here’s a quick recap of what I’ve been up to during that time, to offer a little justification for my mission statement “Building True AI with NLU and Common Sense”:

  • Bilingual French Classics iOS App

    I just developed and released a Bilingual French Classics app, as a convergence of my passions for language, AI and mobile, and to kill the pain I had been feeling while learning French an hour a day for about two years. The app offers a fun, addictive and effective way to learn French by letting you read timeless French classics in a bilingual format that keeps you in the flow. Improving your vocabulary significantly and naturally this way can work magic on your other language skills, including watching your favorite Netflix movies or TV series with French captions.

  • Building Mobile AI Apps with TensorFlow

    Update June 12, 2018: The book has been officially published and the complete source code is available here. I’m honored to have Pete Warden, the lead of Google’s TensorFlow mobile team, as a technical reviewer of the book, and Aurélien Géron, the best-selling author of Hands-On Machine Learning with Scikit-Learn and TensorFlow, as the author of the Foreword, in which he kindly writes about my book - “This is going to be a super popular book. It’s such an important topic, and it’s hard to get good reliable information.”

  • Object Detection: From the TensorFlow API to YOLOv2 on iOS

    Late in May, I decided to learn more about CNNs by participating in a Kaggle competition called Sealion Population Count. First, I came across this nice notebook by Radu Stoicescu, Use keras to classify Sea Lions: 0.91 accuracy, which contains the statement “This is the state of the art (object detection) method at the moment: https://cs.stanford.edu/people/karpathy/rcnn/”. So I checked out Andrej Karpathy’s Playing around with RCNN, State of the Art Object Detector and the original RCNN paper, not realizing it was the state of the art as of 2014, until days later when I rewatched the Stanford CS231n lecture 8 video, Spatial Localization and Detection by Justin Johnson (somehow the first time I watched it, months ago, it left no impression on me; maybe I just fell asleep). It’s a great video, and it covers the better (state of the art as of Feb 2016) object detection models that came after RCNN: Fast RCNN, Faster RCNN, and YOLO. So I spent a few more days reading the papers and looking at some GitHub repos implementing the models.

  • How to Develop a Prisma-like iOS App with Offline Art Filters

    Update March 8, 2018: I’m busy writing a book on building iOS and Android apps with TensorFlow, and one of the chapters I have completed has updated info on this model as well as a detailed tutorial on using the TensorFlow multi-style model (stylize_quantized.pb from the official TensorFlow 1.4/1.5/1.6 Android example apps) on iOS. The TensorFlow 1.4/1.5/1.6 stylize_quantized.pb model is faster and much more powerful than the model described in this blog, so I strongly recommend using stylize_quantized.pb on both iOS and Android (a rough sketch of driving the model is shown below). If interested, please check out the book, which is in early access with a 14-day free trial, for details.
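    For a sense of how this multi-style model is driven, here is a minimal desktop-Python sketch using the TF 1.x API. The node names (input, style_num, transformer/expand/conv3/conv/Sigmoid) and the 26-style count are the values I recall from the official TensorFlow Android stylize example; they are assumptions here, so verify them against your copy of the graph before relying on this.

```python
import numpy as np
import tensorflow as tf  # written against the TF 1.x API

GRAPH_FILE = "stylize_quantized.pb"  # from the official TensorFlow Android example
NUM_STYLES = 26                      # number of styles baked into the multi-style model

# Node names as I recall them from the Android stylize example; verify against your graph
INPUT_NODE = "input"
STYLE_NODE = "style_num"
OUTPUT_NODE = "transformer/expand/conv3/conv/Sigmoid"

def stylize(image_rgb_float01, style_index=0):
    """image_rgb_float01: HxWx3 numpy array with values in [0, 1]."""
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(GRAPH_FILE, "rb") as f:
        graph_def.ParseFromString(f.read())

    # One-hot style weights; fractional blends of several styles also work
    style_weights = np.zeros(NUM_STYLES, dtype=np.float32)
    style_weights[style_index] = 1.0

    with tf.Graph().as_default() as g:
        tf.import_graph_def(graph_def, name="")
        with tf.Session(graph=g) as sess:
            out = sess.run(
                OUTPUT_NODE + ":0",
                feed_dict={
                    INPUT_NODE + ":0": image_rgb_float01[np.newaxis, ...],
                    STYLE_NODE + ":0": style_weights,
                })
    return out[0]  # stylized image, HxWx3, values in [0, 1]
```

    On iOS or Android the same graph file and node names are used, just through the mobile TensorFlow runtime instead of a Python session; the book chapter covers those details.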

  • TensorFlow Deep Learning Machine ezDIY

    If you’re thinking of buying or building your own deep learning machine with a nice GPU for training, you may have come across Andrej Karpathy’s tweet about his deep learning rig build, which is a little outdated, having been published in Sep. 2015, or more recently Lukas Biewald’s Build a super fast deep learning machine for under $1,000, published on Feb. 1, 2017. I started training some CNN and RNN models on my own machine in mid-January, and I’m pretty happy with what I ended up with: a very cost-effective Dell PC (Intel i7-6700 3.4GHz CPU, 16GB memory, 2TB SATA hard drive, 4GB Nvidia GTX 745) for $849 and an 8GB Nvidia GTX 1070 GPU for $388, for a total of $1,237. That’s $237 more than Lukas’s machine, but his has a 3GB GTX 1060 (about $200 less than the 8GB GTX 1070, and the 1070 is about 50% faster than the 1060), 8GB memory ($129), a 1TB hard drive ($50), and an Intel i5-6600 3.3GHz CPU. So for $237 more, you get about $400 of extra value (a back-of-the-envelope tally follows below), plus you only need to swap in the GPU instead of building the whole machine from scratch, and the Dell machine comes with Windows 10 Professional, which may not sound too sexy or geeky, but its remote desktop feature can definitely make your life more enjoyable (I’ll talk more about this later).
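    To make the “about $400 of value” claim concrete, here is the back-of-the-envelope tally using only the early-2017 figures quoted above (the CPU upgrade is left unpriced, which is why the rounded total comes out around $400 rather than exactly $379):

```python
# Back-of-the-envelope tally of the component differences quoted in the text
# (approximate early-2017 prices, not current ones).
gpu_difference = 200  # 8GB GTX 1070 vs 3GB GTX 1060, roughly $200 apart
extra_memory   = 129  # 16GB vs 8GB
extra_disk     = 50   # 2TB vs 1TB
cpu_upgrade    = 0    # i7-6700 vs i5-6600, unpriced here; a modest extra on top

extra_value = gpu_difference + extra_memory + extra_disk + cpu_upgrade
extra_cost  = 1237 - 1000  # my build vs Lukas's sub-$1,000 machine

print(f"~${extra_value}+ of extra value for ${extra_cost} more")  # ~$379+ for $237
```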

  • Reinforcement Learning in Tic-Tac-Toe

    Different people learn in different ways. Some prefer to have a teacher, a mentor, a supervisor guiding them through each step of their learning process; others are more like self-learners. Supervised learning is no doubt very important, but I find reinforcement learning (RL) fascinating because of its self-learning flavor - TD-Gammon, a program developed in 1992, learned to play backgammon through hundreds of thousands of self-play games and reached the level of the world’s best human players (a minimal self-play sketch for tic-tac-toe is shown below). The famous AlphaGo also used RL - see Jim Fleming’s Before AlphaGo there was TD-Gammon for more info.
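    To give a feel for that self-learning flavor in the tic-tac-toe setting, here is a minimal tabular TD(0) self-play sketch in the spirit of the classic Sutton & Barto tic-tac-toe example. All function names and hyperparameters are my own illustrative choices, not taken from the original post.

```python
import random
from collections import defaultdict

# Minimal tabular TD(0) self-play for tic-tac-toe; hyperparameters are illustrative.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None

def afterstates(board, mark):
    """All (move, resulting board) pairs for the player using `mark`."""
    return [(i, board[:i] + (mark,) + board[i+1:])
            for i, cell in enumerate(board) if cell == ' ']

def play_one_game(V, alpha=0.1, epsilon=0.1):
    """One self-play game; V maps (mark, board) -> estimated win probability."""
    board, prev = (' ',) * 9, {'X': None, 'O': None}
    mark = 'X'
    while True:
        moves = afterstates(board, mark)
        if random.random() < epsilon:
            _, board = random.choice(moves)                       # explore
        else:
            _, board = max(moves, key=lambda m: V[(mark, m[1])])  # exploit
            if prev[mark] is not None:  # TD(0) backup toward the new afterstate
                V[(mark, prev[mark])] += alpha * (V[(mark, board)] - V[(mark, prev[mark])])
        prev[mark] = board
        result = winner(board)
        if result is not None:
            for m in 'XO':              # final backup toward the true outcome
                reward = 1.0 if result == m else (0.5 if result == 'draw' else 0.0)
                V[(m, prev[m])] += alpha * (reward - V[(m, prev[m])])
            return result
        mark = 'O' if mark == 'X' else 'X'

V = defaultdict(lambda: 0.5)
results = [play_one_game(V) for _ in range(50000)]
print({r: results.count(r) for r in ('X', 'O', 'draw')})
```

    After enough self-play games, draws dominate the results, which is what you’d expect once both sides have learned decent play; TD-Gammon applied the same idea, with a neural network in place of the value table.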

  • What Kind of Dog Is It - Using TensorFlow on Mobile Device

    Even before I got my first dog, a Labrador Retriever, in June 2015, I often wondered, while out walking and seeing a dog, what breed it was. About a year ago, I found the Stanford Dogs Dataset and asked a friend with a Ph.D. in computer vision from CMU whether it would be possible to use it, along with some machine learning algorithm, on my iPhone and reach a recognition precision of about 80% or 90%. This is what he told me: “80-90% will be really hard, unless you are willing to restrict your problem in some way. For example, request a user to take a picture in some specific angle, or reduce the number of classes.” He also said that “for deep learning to work, you will need a lot more data (than the Stanford Dogs Dataset, which has about 100-200 images for each dog breed) to train the neural network”.