mcottondesign

Loving Open-Source One Anonymous Function at a Time.

Making Python use all those Cores and RAM

It is cheap and easy to build a machine with 8 cores/16 threads and 32GB of RAM. It is more complicated to make Python use all of those resources. This blog post will go through strategies for using all of the CPU, the RAM, and the speed of your storage device.

I am using the AMD 3700X from my previous post. It has 8 cores and 16 threads. For this post I will be treating each thread as a core because that is how Ubuntu displays it in System Monitor.

Looping through a directory of 4 million images and doing inference on them one by one is slow. Most of the time is spent waiting on system IO. Loading an image from disk into RAM is slow. Transforming the image once it is in RAM is very fast, and making an inference with the GPU is also fast. In that case Python will only be using 1/16th of the total available processing power, and only that single image will be stored in RAM. Using an SSD or NVMe device instead of a traditional hard drive does speed it up, but not enough.


Loading images into RAM is great, but you will run out at some point, so it is best to lazy-load them. In this case I wrote a quick generator function that takes the batch size it should load as an argument.
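The original code isn't shown here, so this is a minimal sketch of what such a generator might look like, assuming Pillow for image loading and a directory that contains only images:

import os
from PIL import Image  # assumption: Pillow is the image loader

def batch_generator(image_dir, batch_size):
    """Lazily yield lists of images, batch_size at a time."""
    batch = []
    for name in os.listdir(image_dir):
        path = os.path.join(image_dir, name)
        batch.append(Image.open(path).convert('RGB'))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:           # yield whatever is left over at the end
        yield batch

# usage: only one batch of images is held in RAM at a time
for batch in batch_generator('images/', 64):
    pass  # preprocess and run inference on the batch here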


Dealing with a batch of images is better than loading them individually, but they still need to be pre-processed by the CPU and placed in a queue. This is slow when the CPU is only able to use 1/16th of its abilities.


Using the included multiprocessing package you can easily create a bunch of processes and use a queue to shuffle data between them. It also includes the ability to create a pool of processes to make it even more straightforward.
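A rough sketch of the Pool pattern, under the assumption that a preprocess() function handles loading and transforming a single image; the function body, file list, and chunk size below are placeholders:

from multiprocessing import Pool, cpu_count

def preprocess(path):
    # placeholder for the real work: read the file from disk and apply
    # the CPU-side transforms before handing the result to the GPU
    try:
        with open(path, 'rb') as f:
            data = f.read()
    except FileNotFoundError:
        data = b''
    return path, len(data)

if __name__ == '__main__':
    paths = ['img_%05d.jpg' % i for i in range(1000)]  # hypothetical file list
    with Pool(processes=cpu_count()) as pool:
        # imap streams results back as workers finish, so all cores stay busy
        # without materializing 4 million results in RAM at once
        for item in pool.imap(preprocess, paths, chunksize=32):
            pass  # enqueue the preprocessed image for GPU inference here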

In my own testing, my HDD was still slowing down the system because it wasn't able to keep all of the CPU processes busy. I was only able to utilize ~75% of my CPU when loading from a 7200RPM HDD. For testing purposes I loaded a batch onto my NVMe drive and it easily exceeded the CPU's ability to process them. Since I only have a single NVMe drive, I will need to wait for prices to come down before I can convert all of my storage over to ultra-fast flash.

Using the above code you can easily max out your RAM and CPU. Doing this for batches of images means that there is always a supply of images in RAM for the GPU to consume. It also means that going through those 4 million images won't take longer than needed. The next challenge is to speed up GPU inference.

Curating Datasets is the New Programming

Machine learning has changed how I approach new programming tasks. I am going to be working through an example of detecting vehicles in a specific parking space. Traditionally this would be done by looking for motion inside of a specific ROI (region of interest). In this blog post I will be talking through achieving better results by focusing on dataset curation and machine learning.

The traditional way to detect a vehicle would be to try and catch it at a transition point, entering or exiting the frame. You could compare video frames to see what has changed and whether an object has entered or exited the ROI. You could set an expected minimum and maximum object size to improve accuracy and avoid false positives.

Another way would be to look for known landmarks at known locations, such as the lines separating spaces or a disabled-parking symbol. This would involve some adjustment for scaling and skew, but it could have a startup routine that adapts to reasonable values for both. It wouldn't be detecting a vehicle, but there is a high probability that if the features are not visible then the spot is occupied. This could work at night by choosing features that have some degree of visibility in the dark or through rain.

A third technique could be similar to the previous example, except looking at the dominant color instead of specific features. This would not work at night when the camera switches to IR illumination and an IR filter. It sounds like a lazy option, but in specific circumstances it could perform quite well, such as looking for brown UPS vehicles that only make deliveries during daylight hours.

In the traditional methods, you would look at the image with OpenCV and then operate on pixel data. Some assumptions would need to be made about how individual pixels should be grouped together to form an object. The ROI would need to be defined. Care would also need to be taken when handling how individual pixels create a specific feature, or compose a color between our threshold values. All of this has been done before.
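For contrast, a bare-bones version of that traditional pipeline might look something like the sketch below; the ROI coordinates, size thresholds, and video file are made up for illustration and assume OpenCV 4:

import cv2

ROI = (100, 200, 300, 150)          # x, y, w, h of the parking space (made up)
MIN_AREA, MAX_AREA = 5000, 40000    # expected vehicle size in pixels (made up)

cap = cv2.VideoCapture('parking.mp4')        # hypothetical video file
subtractor = cv2.createBackgroundSubtractorMOG2()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    x, y, w, h = ROI
    mask = subtractor.apply(frame[y:y+h, x:x+w])   # motion only inside the ROI
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        area = cv2.contourArea(c)
        if MIN_AREA < area < MAX_AREA:             # object in the expected size range
            print('possible vehicle entering/exiting the space')
cap.release()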


The approach I am advocating for is to not program anything about the image at all. Each frame would be looked at and the model would make a prediction regardless of motion, features, or dominant colors. Such a general approach could easily be marketed as AI. I prefer to call it machine learning to clarify that the machine is doing the work and that its only intelligence is in its ability to compare to known images. Think of it as a graph with a line that separates it into categories. The algorithm's only job is to predict which side of the line each image belongs on.
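As an illustration of how little "programming" this approach involves, here is a minimal sketch of a two-class image model in PyTorch; the folder layout, architecture, and hyperparameters are assumptions for the example, not the exact model described in this post:

import torch
from torch import nn
from torchvision import datasets, models, transforms

# the two curated folders (dataset/occupied, dataset/empty) are the "program";
# the code below never changes as the datasets grow
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
data = datasets.ImageFolder('dataset/', transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights=None)              # torchvision >= 0.13
model.fc = nn.Linear(model.fc.in_features, 2)      # two classes: occupied / empty
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()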


As more example images are added to the model, the line separating the groups of images changes to accommodate the complete dataset. If you are not careful to keep the two groups balanced, then biases in the datasets can appear. Think of this as a simple programming bug.


The first image is considered in the parking space, the second image is not. In both cases, the vehicle is blocking the parking space. The difference is arbitrary, but these special cases can be handled by including examples in the correct dataset. Think of what goes into the datasets as domain specific rules.


It is easy to build a dataset based on images evenly spaced throughout the day. Doing this should give good coverage of all lighting conditions. This method also trains the model with a potential hidden bias. Because there are more cars parked during the day, it is more likely the model will learn that nighttime always means there is no car. The image on the left shows a person walking and gets a different result than a nearly identical image during the day. The image on the right is included to balance the dataset and account for this bias.

The line separating the two groups is getting more complicated and would be very difficult to program using traditional methods.


These final examples show that the model must be able to account for unexpected events such as heavy rain or a car parked sideways. Handling unexpected data like this using the traditional methods would require significant rework.


In building this model, the programming stayed the same. It is all about the data and the curation of the datasets. The future of programming will be more about curating datasets and less about hand-coding rules. I'm very excited for this change in programming and the new applications that can be made through it. We will need our tooling to catch up and I am excited to be working on it.

Questions, comments, concerns? Feel free to reach out to me at mcotton at mcottondesign.com

Building an AI/ML workstation in 2020

The cloud is a great way to get started with AI/ML. The going rate is around $4/hr for a GPU instance. It might be all that you need, but if you need to maximize your personal hardware budget, this is my guide to building a workstation.

Price

I spent ~$1000 and you could get it even lower with the current deals. I decided on an AMD Ryzen 3700X, Nvidia 2060 Super, 32GB of RAM, and an NVMe drive. I could connect all of this to a B450M motherboard so I didn't see any reason to spend more. I also included another SSD for Windows 10 and two HDDs for storing datasets.

CPU

The Ryzen 3700X is far more than I need and most of the time several cores are idle. Because most of my tooling is in Python, it is a real struggle to make use of the resources. The multiprocessing library is great and it isn't too complicated to make a Pool of CPU cores for the program to use.

GPU

The 2060 Super is in a very odd position in the current lineup. It is the cheapest model with 8GB of VRAM; to get 3GB more you have to pay 3x the price. It makes more sense to jump up to 24GB for 6x the price. Training CNN models on image data requires significant amounts of VRAM. You could spend more, but without more VRAM the money would only buy marginal improvements from additional CUDA cores.

RAM

System memory is very similar to CPU cores: you will rarely max it out, but it never hurts to have more available. I typically use around 20GB, which means I could have reduced my batch size and gotten by with 16GB, or I could just spend the extra $80 like I did.

STORAGE

I'm using a 1TB NVME drive. It is very fast and tucks away nicely on the motherboard. It is overkill for my needs but it is nice to have the space to store millions of images and not wait on traditional HDDs.

Speaking of HDDs, I have two: a 1TB and a 2TB. The 1TB is just for datasets, so that I can keep space available on the faster drive. The second drive is for automatic backups of the main disk and datasets. Backups also go to two different NASs, but that is another blog post.

It would be a shame to have this machine and not play games on it. I'm using an inexpensive 1TB SSD for Windows 10. I don't like dual booting off of a single drive. I prefer using the bios boot selector to switch between OSes.

COOLING

The case I'm using has four 120mm fans. I added a giant Noctua heat sink with two additional 140mm fans. I adjusted the motherboard fan curves so that everything is nice and quiet. I believe in having a lot of airflow and I haven't had any temperature problems inside the case (the damn thing very noticeably heats up my office, though).

UPGRADES

I'm currently happy with the way it is. The obvious upgrade will be for more GPU VRAM. I decided against water-cooling but if I get to a place with multiple GPUs that appears to be the best solution.

Blinking a light when Eagle Eye camera detects motion

This is a quick video on how to blink a light when an Eagle Eye camera detects motion. There are plenty of ways to do this, and this is just the way I decided based on what I had available on my desk.

I used the following parts:

  • Raspberry Pi 2B
  • Arduino Uno
  • one red LED
  • 3D printed Eagle Eye logo
  • EE-blinker that runs on the Raspberry Pi
  • blink_sketch that runs on the Arduino

The Arduino was configured with the red LED on pin 13. The sketch looks for serial communication (over the USB port from the Raspberry Pi) of an ASCII character. If the character is '1' it turns the LED on; if it is '2' it turns it off.

As an aside, why didn't I make it 1 or 0? Because originally I was planning on having multiple states and the states did a sequence of things.

The Raspberry Pi is just running a Node.js program and it could be replaced by any other computer. The program itself could also be replaced with any other language that can subscribe to the Eagle Eye API.
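As an illustration of how small that piece is, here is a hedged Python equivalent of the serial side using pyserial; the port path and baud rate are assumptions and should match your own setup and the values in `config.js`:

import serial  # pyserial

# port path and baud rate are assumptions; match them to your Arduino
arduino = serial.Serial('/dev/ttyACM0', 9600, timeout=1)

def set_led(on):
    # the Arduino sketch turns the LED on for ASCII '1' and off for '2'
    arduino.write(b'1' if on else b'2')

set_led(True)   # motion started
set_led(False)  # motion ended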

You can see the code for the entire project here. Make sure to read the README file and follow the instructions to create an appropriate `config.js` file. The config file also sets the serial port used to connect to the Arduino, so make sure that is right.

The above example is listening for motion events. Specifically it is listening for ROMS and ROME events (ROI motion start, ROI motion end). It could easily be adapted to use other events.

If you are monitoring for status changes (camera online/offline, recording on/off, streaming on/off) you most likely want to listen to the status bitmask directly. The events are delayed because we run them through our own internal heuristics to filter out false positives.

You can find an example of subscribing to the poll stream, getting the status bitmask, and parsing it in this example project.
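To give a flavor of what parsing a status bitmask involves, here is a generic sketch; the bit positions are placeholders for illustration, not the real Eagle Eye values, which are documented in the API and the example project:

# placeholder bit positions for illustration only; consult the Eagle Eye
# API documentation for the real meaning of each bit
ONLINE_BIT    = 1 << 0
RECORDING_BIT = 1 << 1
STREAMING_BIT = 1 << 2

def parse_status(bitmask):
    return {
        'online': bool(bitmask & ONLINE_BIT),
        'recording': bool(bitmask & RECORDING_BIT),
        'streaming': bool(bitmask & STREAMING_BIT),
    }

print(parse_status(0b101))  # {'online': True, 'recording': False, 'streaming': True}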

Detecting an open garage door with Machine Learning, Computer Vision, and Eagle Eye

Introduction

I wanted to know if the garage door is left open. This is how I trained my computer to text me when it has been open for too long.

Detecting if a garage door is open or closed is a surprisingly challenging problem. It is possible to do this with contact sensors or even with a motion (PIR) sensor, but that requires running additional wires and a controller.

My solution was to mount a wireless IP camera on the ceiling. It connects to my home network using wifi and plugs into the same power outlet as the garage door opener.

Traditional video security analytics such as motion or line cross do not adequately detect the door being left open. Both motion and line cross depend on something happening to trigger them. This problem is different because I am looking at the position of the garage door instead of movement.

I also looked at a traditional computer vision solution to detect sunlight coming in through the open door. Ultimately, trying to detect based on the 'brightness' of the scene would not work in this situation. The way the camera is positioned, it would not be able to differentiate between the light from the ceiling and sunlight coming in through the garage door. This also would not work at night or during a storm. A different tool is needed.

I wanted to experiment with creating a machine learning model. This is a great first project to get started with.


Apple Watch Charging Stand

This is a present I made for a friend. It makes a handy travel stand because it is cheap to produce, lightweight, and snaps together with magnets.

You can download the STL files on Thingiverse

3D printed bomb drop kit

This is a kit that I designed and printed. Each bomb bay requires a channel on your RC receiver. In this video I am using an 8-channel receiver, but you could easily disconnect your rudder and use that channel while you fly the plane with just bank and yank.

You'll be able to download the STL files.

Why are coders afraid of writing code?

At a previous startup, I had a great idea that we could improve our product with realtime updates from the server. At the time we had implemented long-polling in the client. After failing to get everyone on board I decided to take a weekend to create an example with socket.io and Node.js.

On Monday morning I was proud to show it off around the company. The CEO liked the idea and agreed with me. Yay! We got on a call with the CTO and I proudly explained my work.

He looked at it, then asked me to go into further detail about my choices. How did I implement this? Could I tell him about websockets? How can we roll it into production? What does it look like at the HTTP layer? Well, I could sort of answer his questions. I had to confess that I really didn't know: I had used socket.io, and it uses websockets.

He probed a little more and then started to get impatient. I couldn't believe that he had become dismissive. And worse, the CEO followed his lead. I hadn't figured it out yet but I had made a critical mistake.

Would we now be implementing the socket.io abstraction? Is this what we want? What trade-offs would that mean for us? Is this the only way?

I was afraid to learn how the technology actually worked and I was asking the company to take in several major dependencies in order to implement this.

I stubbornly maintained my position. Why would we skip over an existing answer? I had proved it was working. They could see it working. Why were they not as excited as I was?

We were at an impasse and I still hadn't figured it out yet.

Fast-forward to now, and I am a first time CTO. I've spent a lot of time thinking about lessons learned from my friend, the CTO in the previous story. I do my best to consider why he made the choices he did.

I'm interviewing candidates and I'm starting to see a similar pattern with other developers. They gladly accept dependencies and baggage that limit options and force a direction. They'll make a terrible choice to avoid doing something that makes them uncomfortable.

Coders are afraid of writing code.

A friend of mine recently described how their company needed to trigger multiple events from an external webhook. Not enough people understood a simple web server, so it got pushed to DevOps to build using Salt. Salt isn't a horrible answer, but when he was telling me about this I could see a little of my inexperienced self in the story.

On another project I was dealing with an Indian iOS developer who needed to sort an array. They fought me when I filed a bug that the sort order wasn't correct. A week later they had it working, but had implemented it in the worst way possible: they had pulled in SQLite as a new dependency and were using it to sort the array. When I saw this I was furious. They were a very cheap agency, but after this they weren't worth keeping around at any price.

In conclusion, we ended up implementing websockets. I was the champion for this feature after our CTO guided me in the right direction. We read through the spec for websockets, we came up with the right fallback strategy, and the product improved significantly. We didn't have to blindly add dependencies, we understood the technology we wanted to use, and we found the right fit for our needs.

What is the weirdest bug you've ever dealt with?

I was asked this once at an informal interview and didn't have an immediate answer. I told a story about the time I was trying to debug RF signals and the ultimate cause was that there was another remote using the same hardware ID. It was extremely improbable, but it turned out to be the cause.

Since then I've had more time to reflect on how I would answer that question if I was asked again.

The weirdest bugs I've ever dealt with all have a common theme: distributed systems. In my last two startups we have used distributed architectures. This has caused somewhat simple tasks to become very difficult.

In the old days of a monolithic app server (Django, Rails, PHP) all the logs were in one place. As you scaled you would need to add a load balancer and then spread the requests over multiple app servers. Reasoning about the path an HTTP request took was still fairly simple. At this level a relational database is perfectly able to handle the load.

In a distributed system, HTTP is used not just for the initial request but also for communication between servers. This means you now have to deal with internal HTTP responses, and they have different meanings from the external request/response that you're used to. For example, what do you do when an internal call to the cache returns a 401, 403, or 502? Do the unauthorized/forbidden responses mean the external request should be 401/403, or do they mean your app server is at fault? What about a 502 Bad Gateway? You now have to deal with timing and latency between internal services.
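As a toy illustration of the kind of decision this forces, here is a sketch of an app server translating a failed internal cache call into an external response; the URL and the status mapping are my own assumptions for the example, not a recommendation:

import requests  # assumption: internal services are called over HTTP

def fetch_from_cache(key):
    """Return (external_status, body) after calling a hypothetical internal cache."""
    try:
        resp = requests.get('http://cache.internal/%s' % key, timeout=0.5)
    except requests.Timeout:
        # the cache being slow is our problem, not the client's
        return 503, 'cache timed out'
    if resp.status_code in (401, 403):
        # the app server's own credentials were rejected internally; the external
        # client did nothing wrong, so surface it as a server-side error
        return 500, 'internal auth failure'
    if resp.status_code == 502:
        return 503, 'cache unavailable'
    return 200, resp.text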

Distributed computing and networking are a reality of life for most new projects. It is something we'll have to learn and deal with. In the meantime, it is also the source of my weirdest bugs.

HUVR Truck

Showing off our truck on a nice, sunny afternoon. Shot with a Phantom 3 Pro and an Inspire 1.

Leaving Eagle Eye and moving to HUVRdata

I am excited to announce that I have accepted a new position with HUVRDATA. After an eight week transition I am thrilled to get started. I will be their CTO and will guide them into becoming the leader in drone-based inspections.

It was surprisingly hard to leave Eagle Eye and, specifically, to let the team know what was next for me. I lost sleep over this decision, but we were able to make it through. I loved working with them and my time there was filled with great memories.

I've been working with Bob and Ben at HUVR since the start of the year. Things really kicked into gear when they closed a funding round in August. Right now it is time to mash the accelerator and I would have regretted letting them go on without me. This is an area I am very passionate about, I have the skills to do it, and I am fully on board to make this successful.

Small Improvements Over Time

Improving takes time, patience, and lots of near-misses. There is usually a narrow window in which to capture the ideal scene, and if you aren't in position and ready you will miss it. I continue to be 10 minutes late, out of battery, or out of space on the SD card when I realize this is the shot I wanted.

In this case, I wanted to test the waters before going out again the next day. The challenge of shooting downtown is that you attract too much attention to play around with expensive equipment as a crowd gathers.

Austin after Sunset

Pushing the distance out a little further and keeping it low across the water. Experimenting with Final Cut Pro X to get the titles and transition.

Out on a Bat Cruise

I had a terrific time with @LSRiverboats getting this footage. This was captured in 4K@30fps but rendered to 720p for Vimeo. The initial shot was digitally zoomed in and the rest is right off the camera. Some minor exposure correction was done in post to make everything match.

This was a lot of fun but extremely challenging to operate from a boat. There were two boats, fading daylight, bats and other groups to contend with. Thankfully everything worked out and I am happy with the results.

Getting Better

I've been trying for quite some time to get this shot. It was interesting because this was also my longest distance flight. I only went 1000ft up the river and it took a lot of squinting to keep it in sight.

Another Try at an Old Location

This was taken out front of the Austin City Power Plant and is using the new manual camera controls of the Phantom 3 Pro.

It is hard to go back to past locations and try to get a fresh take. The challenge is that you are no longer looking at them with fresh eyes. This is made worse if you liked the results from previous tries.

I am challenging myself to throw away my previous work. All locations (except the Capitol) are fair game. I am also not going to get hung up on having done a shot before. I will also shamelessly steal good ideas from others. I am still in the beginner phase and am in no position to be selective.

Smooth Sailin'

This is video from my first flight with the new Phantom 3 Pro. I picked it up earlier today and ran it through its paces. So far, it is a significant improvement over the Phantom 2 Vision+. The iOS app is the major differentiator. All the other features are in support of the app. The remote is the same as the Inspire 1's and feels much more refined. Common functions can be performed using hard buttons and there is less time spent pawing at the app. I am a big fan of physical buttons and not having to look at what I'm pushing.

I have (foolishly) upgraded my iPad Retina mini to beta versions of iOS 9. The DJI app isn't playing nicely with the new HD video streaming. I can't hold this against them.

Setbacks and Crashes

This is an older video that I hadn't posted yet. It is from a couple months ago and has nothing to do with the rest of the post.

I had a crash earlier in the week that destroyed my Phantom 2. Almost immediately after takeoff it lost all power and unceremoniously fell from the sky. My first reaction was shock. I don't know if the battery wasn't seated all the way or it experienced a different malfunction.

After some time to think, my shock turned to gratitude. Gratitude that this wasn't over water, wasn't for a client, wasn't over a road, wasn't over a crowd. I don't fly over crowds but it is always tempting to push the boundaries.

Pushing the boundaries is what makes this emerging field exciting. Getting these images ends up being expensive when you look at the output you get for the money invested.

I can't wait to announce my next aircraft.

Getting more cinematic

The goal was to make a short compilation from several different shots. I wanted something visually interesting without spending hours on each shot. This was all done on just one battery. Except for the last shot, we only took one pass at each shot and didn't review anything. I had several more shots in mind, but will save those for another compilation. I enjoy improvising and the pressure of just starting the camera.

Sunrise Panorama

This is a composite image from several individual shots. It was taken on a peaceful Sunday morning just after sunrise. I really enjoy Sunday morning photo shoots. Austin is still recovering from its collective hangover, the roads are empty, and the sun isn't scorching hot, yet.

There is something more challenging about getting a sunrise picture. Sunsets are easy; I can tell how the sunset will turn out on my drive home. There is very little risk that the weather will change between packing up and setting up. On the downside, flying at sunset garners more attention and then you are heading home in the dark (with expensive equipment).

Most importantly, after a nice Sunday morning out flying you can always get breakfast tacos on the way back to bed. Let me know if you are interested in this image.

Making Progress

I'm starting to make progress with my aerial photography. The images and video are making me cringe less. I'm starting very basic and working up slowly, instead of just cranking up fancy effects in Lightroom or adding some razzle-dazzle from Final Cut Pro. Right now it is about making something that works well as desktop wallpaper.

This blog is the new starting point for a gallery. The goal is more about progress than a store of available prints. (That's not to say I won't sell out and turn this into a store.) If there is something that catches your fancy, feel free to reach out.

Putting in the hours

One of the great advantages to sharing your beginner work is that you are able to see rapid progress. The initial learning curve is usually pretty quick. Making a Vimeo and YouTube channel and posting to reddit provide quick feedback. A lot of the feedback will be critical of what you are doing, but this is all part of learning.

Once you reach a certain level of achievement you reach a plateau and growth levels off. This is usually a key area of attrition. The funnel tightens and many people are happy to stop getting better. It is important to identify what the next achievement will look like and keep moving towards it. For me, it was being included in an actual video production.

If you are interested, get in touch with Brent at In Focus Video

Picking a new hobby and starting over

I'm trying out a new hobby that has been a casual interest of mine. This year is going to be the year of GoPro photography. It is kind of a weird idea, but the idea is really simple. I've bought the Hero 4 Black so there isn't a better model I can blame for my shortcomings. There is also a wide enough group of users posting their content that I have a significant yardstick to compare myself to.

What have I learned so far?

I've learned that it is easy to have an opinion about what looks good but it is very hard to produce something that I am proud to share. This is the challenge of creative work. My initial work is an embarrassment so far. It is well below the standard I want to produce but this is exactly the point.

What do I want to get out of this?

I want to produce work that I am proud to share. Also, I want to be more comfortable sharing my work. Expanding the things I am comfortable doing without much notice is a sign that I'm progressing towards being an interesting human being.

The things that hold me back from dancing at weddings are the same things that hold me back with my creative work. This way I'll have something interesting to show people who are also not dancing at weddings.

Translating text in Backbone.js templates

We recently added Japanese support to our webapp. When implementing it, we decided on the most straightforward method: converting strings at run-time by adding a helper method to the view.

The first step was to extend Backbone.View to include a translate method (abbreviated as 't' to reduce noise in the template). It performs a simple dictionary lookup and falls back to English if a suitable translation cannot be found.

BaseView = Backbone.View.extend({
    t: function(str, skip) {
        // Check if the browser/system language is Japanese; otherwise
        // fall back to the default ('US' stands in for English here).
        var lang = (navigator.language || '').indexOf('ja') === 0 ? 'ja' : 'US';
        var dictionary = {
                'US': {},
                'ja': {
                    'Dashboard': 'ダッシュボード',
                    'Layouts': 'レイアウト',
                    'Cameras': 'カメラ',
                    'Users': 'ユーザー',
                    'Map': 'マップ',
                    'Installer Tools': 'インストールツール',
                    'Status': '状態'
                }
        };

        // Return the translation when one exists, otherwise the original string.
        return dictionary[lang][str] === undefined ? str : dictionary[lang][str];
    }
});

And the specific view needs to be changed.

AddCameraMessageView = Backbone.View.extend({

gets changed to be

AddCameraMessageView = BaseView.extend({

The individual templates can now use the new translate helper method on any string.

<tr>
    <td>Device</td>
    <td>Name</td>
    <td>Status</td>
</tr>

gets changed to be

<tr>
    <td><%= this.t('Device') %></td>
    <td><%= this.t('Name') %></td>
    <td><%= this.t('Status') %></td>
</tr>

The next iteration will perform this translation at build time, by writing a grunt plugin, instead of at run-time. This will remove any rendering-time penalty for our users.

Talk at Austin Python Meetup 8/14

I had a lot of fun speaking at the Austin Python Meetup. My presentation was on the current options available to those who want to use Python with embedded hardware. It was a great group of people and I got the chance to bring some toys.

Presentation is here