I containerized my ChatGPT news app, and deployed it in Amazon Lightsail. You can give it a try here: https://container-service-1.2h766bqgeo2a8.us-east-1.cs.amazonlightsail.com/. I’m taking a news feed and giving that to ChatGPT using their API, and also asking ChatGPT to give me a (hopefully) humorous sales pitch for a classic “advertised on TV” product, using the news story as inspiration. My idea was to test out the concept of using generative AI in, for instance, a business application that creates timely sales pitches.
You could direct ChatGPT to provide serious sales pitches, but I asked for funny ones. The result is something like the Onion, except it’s completely machine generated. For example, when an ET Online article described how Kim Kardashian sported a pink suit that made her look like a Power Ranger, my web site generated this story: “In a hilarious turn of events, Kim Kardashian recently moonlighted as a Pink Mighty Morphin Power Ranger at the launch of the Navage Nasal Irrigation System. Kardashian donned her pink suit and posed with Navage, a device that uses a gentle saline flush to cleanse your nasal passages and relieve congestion. With Navage, you can put the power of the Pink Ranger in your pocket and blast away your sinus problems!”
Amazon Lightsail is useful for small sites like this one. I was expecting to be able to export my container from Docker on my Mac into a file and upload it through the AWS Console to Lightsail. That’s not how it works, though. You need to start a container service in the AWS console and then use the AWS command line on your Mac or PC to extract the container image from Docker and push it to Lightsail, as AWS describes here. Once it’s up there you can start it from the console. It costs only $7 a month to run a container using the smallest (Nano) server.
I’ve updated the code in Github if you want to try it.
I made a small Flask web site that reads the latest news feed from CNN and uses the ChatGPT API to generate a (hopefully funny) sales pitch for a product
Most people have tried out the web interface to ChatGPT, and been impressed by its capabilities. But how might you use it in your business? There’s an API for ChatGPT so you can build this into any application you want. I tested it out by building a small Python Flask web site.
My idea was that you might want to use ChatGPT to generate topical sales pitches for your products. So I grabbed a news feed from CNN and picked a few classic humorous products, like the Ginsu Knife and the Fishin’ Magician, and tried to get ChatGPT to help me generate web site content. I used the news stories to build a prompt for ChatGPT, and asked it to summarize the stories and segue to a sales pitch for one of the products.
You can see my code in Github. I started with a Flask blogging tutorial on the Digital Ocean web site, and modified it as needed. The app consists of just one Python file, which reads the news feed and makes the calls to the ChatGPT API, plus some HTML files, and a CSS file (these were mostly unchanged from the tutorial app).
You can get idea of the results below.
Sometimes the content ChatGPT generated was hilarious, and other times it was a bit generic. At one point I got this one, which was pretty funny:
This news story discusses how President Donald Trump was indicted on two charges related to hush money payments made to adult film star Stormy Daniels. Trump is accused of breaking campaign finance laws by paying off Daniels in order to silence her about an alleged affair. Why should you buy the K-tel Fishin’ Magician? Because with this device, you can cast your line and catch a big fish like President Trump did with Stormy Daniels – without the legal consequences! Don’t get caught with your pants down – get the K-tel Fishin’ Magician and reel in the big one!
Of course this is just a fun example that I created, but as you can see, I’m using ChatGPT to generate topical content in a fully automated way, and this would (hopefully) be useful to a business that needs to tailor a sales pitch on the fly. You could imagine feeding in a person’s LinkedIn profile, or their company’s profile to obtain even more tailored text.
In the course of trying this out, I learned a few lessons:
The API is quite expensive. OpenAPI gives you a trial, which contains $18 in credits. They only give you a couple of days to use them however. So I found myself signing in through other accounts so I could keep playing with it. I had hoped to deploy this on a web server so others could try it, but it looks like that could cost a fortune. I’m hoping Google Bard will cost less and perhaps I can switch to it.
You pay by the token. OpenAI says that “you can think of tokens as pieces of words, where 1,000 tokens is about 750 words.” I was using the “Davinci” model, which costs 2 cents/1000 tokens. They have more expensive models, like ChatGPT4, which costs up to 12 cents per 1000 tokens. Since I was summarizing a lot of news stories, I found that by just testing my site I was blowing through a lot of tokens.
I was surprised that the API didn’t behave like the Chatbot web site OpenAPI provides. It appears you need to keep reminding ChatGPT what it said previously. I was trying get ChatGPT to generate a headline for a previously generated news summary, but it kept generating headlines that seemed to be completely unrelated to the previously generated text. So I changed my code to take that text and feed it back to ChatGPT, and told it to consider that input and generate a short, headline-style summary. That worked much better. The only problem with that is it seems to consume more tokens because I believe you pay for both “prompts” (your input text) and the output text. So if you’re using the API to carry on a conversation it’s going to get more and more expensive the longer the conversation goes.
You may have seen some people having a laugh because they managed to get GPT-3 (a massive and well-hyped ML model that can generate human-like text) to tell them that the world’s fastest marine mammal is a sailfish. I assume that’s a problem with the training data, more so than the approach GPT-3’s inventors used. GPT-3 is trained from a huge “corpus” of text that resulted from crawling the web for eight years, plus a lot of books and Wikipedia. So sure, you can trip this thing up on some subjects because it’s only as smart as the Internet. GPT-3 doesn’t really have “knowledge” but rather it generates plausible text based on it’s training, your input and what it recently generated. So it’s not correct to say it’s “stupid” because it doesn’t know that a sailfish isn’t a mammal. This was just some plausible human-sounding text it generated based on those factors, so most likely there is something on the internet led GPT-3 to generate text implying a sailfish is a mammal. But anyway, I played around with GPT-3 and found it to be pretty mindblowing.
OpenAI just put out a Chatbot interface to GPT3 so it’s easy to try it for yourself. Check out this conversation I had with it. I tried to “trick” it by asking about how many members of the Beatles there were. Everyone knows there were 4 members: Paul, John, George, and Ringo. Except there weren’t. There was also Stuart Sutcliffe and Pete Best, who were early members who didn’t quite work out or had other interests. And what about their producer, George Martin? Some people called him the “fifth” Beatle. I found that the Chatbot incorporated my prior questions and answers, and appeared to get smarter about this subject pretty quickly. Impressive!
Using unsupervised learning and logistic regression on real NHL data to determine what it takes to win in pro hockey
I’ve been working on a new side project, and I’ve posted some of this on Kaggle. I’m trying to use NHL player and team data to determine how to build a winning team. I used k-means clustering, PCA, and logistic regression to accomplish this. My sense is that NHL GMs still determine how to build a team in a smoke-filled room, without a lot of analytics, although maybe I’m wrong. But looking on Kaggle or even some web sites dedicated to hockey analytics, I’ve never seen something similar, so I think it’s pretty innovative work (if I do say so myself).
The idea of this analysis is to use NHL player and team performance data to determine the characteristics of a high-performing NHL team, i.e., what mix of players do they have? With a model like this, we can look at the roster of a team, and determine their chances of going deep into the playoffs, and we can also make suggestions about what kinds of player moves they must make in order to become a great team.
I decided to use regular season player statistics, because this ensures we have a lot of data on every player in the league. However, to determine if a team is a high-performer, I used playoff performance, not regular season performance. It might be worth trying to use regular season data, but playoff performance is the ultimate test, and arguably the nature of the game changes in the playoffs. So I used playoff wins as the performance measure for a good team.
My approach was as follows:
Use k-means clustering to create clusters of similar players. I used three different groupings–forwards, defenders, and goalies, and created clusters within each grouping
Use those clusters to characterize the 32 NHL teams for each season between 2008/9 and 2021/22. How many players from each cluster do the teams have?
Then use logistic regression to see if we can find relationships between teams having certain mixes of players, and going “deep” in the playoffs. e.g., What’s the relationship between having a forward from cluster 3 and going deep in the playoffs? If we know that, we know how important it is to have such players on your roster, and we can draw conclusions regarding the player moves teams should attempt to make in order to improve.
In this post I’m going to describe how to build a simple Tesla Tracker that runs in AWS (see my Github). It’s some Python code that uses an AWS Lambda to gather data from your car, and store it in a DynamoDB database table. Because it’s fully serverless (of course there are servers under there somewhere, but as far as you’re concerned, you don’t need to think about them), it’s amazingly cheap to run. For this application, Lambda costs me zero (given the relatively small number of Lambda invocations I’m making here). The DynamoDB table costs me about 59 cents/month (choose the lowest value you can for “provisioned read and write capacity units” to keep the cost down), and AWS Parameter Store costs me 5 cents per month. So you can see how ridiculously cheap it is to run a serverless application like this in AWS.
In a prior post, I described my first attempt to build a simple means of gathering telemetry data from my Tesla using Tesla’s owner API. I briefly had some fun watching my AWS DynamoDB table fill up with readings showing the location of my car, the temperature, battery level, etc. However, after a while Tesla locked me out of the API, probably because I was creating too many tokens and I wasn’t storing and reusing them properly (lesson learned). Worse, Tesla uses the same API for your mobile app, which is also your key to the car, so that was locked out too!
It turns out that Tesla has now changed the authentication approach on their API to use OAuth2, so I needed to update my approach, anyway.
As before, I’m using:
AWS Lambda as a place to run my code, which wakes up periodically and gathers data from my car, and
DynamoDB, Amazon’s managed No-SQL database, to store the data that’s retrieved.
I made a couple of key changes to deal with the Tesla’s new authentication approach, and also make sure I was storing and reusing tokens properly:
Use the teslapy API. This is a Python library for calling the Tesla Owner API. The volunteer who is maintaining it gives you some good instructions on his Github about how to manage authentication with the Tesla API, and persist the API tokens. Follow the instructions on Tim’s Github regarding how to get your first token. Then you can paste it into AWS Parameter Store (see below) and teslapy should manage renewing the token for you after that.
Use AWS System Manager Parameter Store as a place to persist my tokens. Theteslapy Github shows you how to store your tokens with SQL Lite. But when you’re using Lambda there isn’t really a good place to store the SQL Lite DB between Lambda runs, unless you want to do something like put it in an S3 bucket. But since we’re using AWS, let’s use the native AWS tool for this, not SQL Lite. AWS Parameter Store is tailor-made for this kind of thing. It’s not difficult to use, and you can encrypt your stored credentials. You can look at how I did this in Github.
In my prior post, I included some details on how to zip up any Python libraries that your code depends on (in this case, teslapy) and upload them to Lambda. This can be a little bit tricky (I got it wrong the first few tries) but this blogger gives you exact bash commands to use, so if you follow those instructions carefully it’s not so bad.
To use AWS System Manager Parameter Store, just go into the AWS Console and create a new parameter. I called mine ‘My_Tesla_Parameters’. So that’s where I’m going to store my API tokens. There are actually a few parameters that teslapy wants to persist. So rather then separating them all out and storing them separately, I just stored them as one big JSON. There are 2 kinds of parameters in AWS SMPS–standard and advanced. It turns out that shoving all of my parameters into a single JSON put me over the size limit for standard, so I had to move up to advanced, which costs me 5 cents per month. My lambda code has functions that get and store the parameter JSON in AWS SMPS using the AWS boto API. That way, I get the Tesla API token before making my API calls, and put it back before terminating the Lambda (in case the teslapy API decided to update my token based on whether it found that it was about to expire). Another option is AWS Secrets Manager. For my application, though, Secrets Manager looks like overkill because it seems to be most helpful if you’re rotating credentials. That’s not really relevant here–I’m just storing my Tesla tokens, and teslapy is going to help me refresh them every 45 days or so when they expire.
You need to give your Lambda an IAM role that lets it access the various AWS services that it needs, namely, DynamoDB and Parameter Store. When you create a Lambda, AWS gives you the option of creating a new role. That’s a default role that the Lambda will assume, but by default it won’t have access to anything except Lambda. So you can just let AWS create that role for you, but then you need to go into IAM in the AWS Console and find the role and edit it. Add two ‘inline’ policies to allow access to DynamoDB and System Manager Parameter Store. It’s good practice to give those policies the least privileges necessary (i.e., just write access to DynamoDB, and just Put and Get privileges for System Manager Parameter Store).
You can use AWS Eventbridge to wake up the Lambda every so often. Below, you can see a few data points in my DynamoDB table, as well as a map showing some of the points captured during a recent trip, and stored in my database.
My plan is to let this run for a while and possibly try to use the data for Machine Learning, or some kind of dashboard application. Things look good after a few days. I haven’t locked myself out of my car yet!
This may or may not be useful to people who are working on projects and need a GPU for cheap. I use AWS at work, but I’ve found their approach to ML is overkill for my basic needs. i.e., you can use AWS Sagemaker’s managed notebooks but connecting it to your code and providing a place to put model checkpoints requires (I believe) hooking it up to an S3 bucket, providing the right bucket IAM permissions, etc. Or you can get an AWS instance with a GPU, but similarly you need to connect it to S3. It’s not that difficult, but it’s some overhead you might not want to bother with if you aren’t too familiar with AWS.
I find Google is easier to work with in this respect. You can use Colab. You get a GPU sometimes, but if you don’t want Google to kill your session at an inconvenient time, you need to pay up for the Pro or Pro+ subscription, which are $10 and $50 per month, respectively. Pro+ is actually a good value. It might be more than you want to spend, although it’s nice because you can run multiple sessions at a time and you get access to good GPUs (e.g., NVIDIA P100).
Another option is a GCP instance, so I’m using that too. With a new account you can get $300 in credits. Fire up a GPU ML instance like this:
You can get a low end GPU instance for about $300 per month. So your credits cover that. You will need to request a quota increase from 0 to 1 GPUs, because the default quota is 0. If you try to launch the instance without this, you’ll find it doesn’t work. So ask for a quota increase to 1 GPU, as follows, and a little while later you should be able to launch a GPU instance.
You can go for a higher-end instance ($1000 a month or so) but just be careful to stop it when you aren’t using it, and delete it once your project is done, just to be sure. I find GCP is much easier to use for training than AWS. You SSH into the instance by clicking the SSH button next to it on the VM Instances screen under Compute Engine. Then, from the browser-based SSH window, you can upload your code and any data you need. No need to connect it to a bucket and work out IAM permissions for accessing it.
GCP’s deep learning images come pre-installed with a lot of python libraries, but if you need more, just pip install them, and you are in business! You can be up and running in just a few minutes, and if you’re careful you should be able to do this for free (within your GCP credits).
I recently bought a new toy which I think really represents the future of AI. It’s the Coral chip, from Google. Google is a leader in Machine Learning with their Tensorflow platform, and now they are pushing down to the “edge” of machine learning with their Coral chip. This is important because if we can buy low-power, cost-effective chips that can speed up machine learning inferencing, we can run ML at a reasonable price on very low-powered devices such as IoT devices that can’t run ML in the cloud (e.g., because they don’t have the connectivity, or because they need to make quick, autonomous decisions using ML).
Google calls the Coral chip a TPU, for Tensor Processing Unit, because it’s a custom chip that’s designed to natively process tensors, which is how data is represented in neural networks. Because it’s designed to do exactly this, it’s way faster than just using a standard CPU when it comes to doing the math required to do neural network ‘inferencing’, i.e., making predictions using an ML model.
Buying and installing the Google Edge TPU
I ordered the Coral USB accelerator from Mouser. If you like machine learning, it’s a lot fun for just $60. Of course the main idea of the Coral is that hardware makers will build it into their products, but with the USB accelerator, you can easily add it to your Pi. Here, you can see my Pi with the USB accelerator attached by USB.
To install the Edge TPU on your Raspberry Pi, you follow these instructions. They recommend using the ‘standard’ runtime, but they also provide a “maximum operating frequency” runtime, although they warn that if you use this, your USB accelerator can get very hot, to the point where it can burn you! And you need to accept their warning before they even let you install the runtime software that makes the Coral run that fast.
Well, that sounded like too much fun to ignore, so I tried both the standard and the high-speed modes.
The Coral chip only works with Tensorflow Lite (TFLITE), which is Google’s version of Tensorflow designed for processing at the edge. In a prior post, I used my library of photos taken by my Raspberry Pi of birds at my bird feeder. I sorted the photos into folders named for the relevant species of bird, and fed it into the TFLITE model maker to build a TFLITE model. Then, I took that model and compiled it for the Coral Edge TPU according to Google’s instructions. You can see my Python notebook for training and compiling my model in Github. I got most of this code from examples provided by Google, but if you’re trying to train your own model and run it on a Coral chip, hopefully this will be useful for you. Once you have the model (and the labels in a .txt file), you can use the sample Python code that Google provides to run a TFLITE image recognition model on your Pi. I made a couple minor modifications to this, but nothing much.
Below is a video of a bird being recognized by the model without the Coral chip. The important number to look at is the time to run the TFLITE model, in milliseconds. You can see it takes around 140ms each time we take a grab from the Pi’s video stream and feed it into the model and run it.
Now let’s check out a Goldfinch being recognized by the TFLITE model running on the Edge TPU. Wow! Each call of the model executes in under 7ms! That’s 1/20th of the time it takes to run on the Raspberry Pi’s (low powered) ARM processor.
Just for fun I tried out the “maximum operating frequency” runtime, which Google warned us about. It gets the execution time of my ML model down to around 5ms or so. So that’s around a 25% reduction in execution time vs. the standard Edge TPU runtime, but compared with the time to run the model on the Pi’s CPU (140ms), that 2ms savings is probably not worth the extra power consumption in most cases. If I get a chance I’ll measure the temperature of the TPU after running for a while on the “max” runtime. Maybe I can cook an egg with it.
I’ve been interested in running my bird feeder image recognition model fully on a Raspberry Pi. I’m using a Pi model 4 with the High Quality (HQ) Pi camera to take pictures of the feeder, but so far I’ve only run the Machine Learning model for image recognition in the cloud. Using the fast.ai Python library, I was able to get about 95% accuracy when recognizing 15 bird species that are typically seen in my area.
But the process of sending the image to the cloud, calling the image recognition software, and getting a response back to the Raspberry Pi could take a couple seconds. That’s not fast enough if you need to take a quick action based on the result. For example, I’ve toyed with the idea of recognizing squirrels near the bird feeder and warning them away by squirting them automatically with a hose. To quickly activate the hose when the camera sees a squirrel, we’d want to run the ML algorithm on “the edge”, i.e., on the Pi itself, not in the cloud.
Tensorflow Lite, by Google, looks like an interesting approach to running ML on the edge. You can get in the game pretty quickly with some tutorials and sample code that Google provides. I took my library of bird pictures, with labels, and used a Python notebook that follows Google’s sample code. The results (around 75% test accuracy) aren’t as strong as those I achieved with fast.ai (about 95%), but they weren’t bad given the limitations of the basic TFLite approach that Google recommends.
Google provides some image recognition models that you can use for transfer learning. Actually they have a whole bunch of these. These models come pre-trained by Google on basic image recognition, so you can use them as a starting point. They have so many that it’s difficult to know which one will be best for your situation without just trying it. I ended up with EfficientNet, which is the default recommended by Google. One issue is that (as far as I can tell) Google’s image_classifier.create API works best when you just leave the weights in EfficientNet’s pre-trained model alone, and only train the neural network layer that you add onto EfficientNet in order to classify the photos in your situation (in my case, determining the species of bird in the photo). If you’re willing to invest more time, I think you could use Keras to try to freeze EfficientNet for a few training epochs, and then allow EfficientNet’s weights to be trained after that. This is what fast.ai lets you do very easily, so I think that’s one reason the results from fast.ai are much better than with TFLite. The other issue with TFLite (at least as I implemented it) was that I was quantizing the model down to 8 bit floating point numbers. You do that to get a speed improvement so that you can run inferencing (i.e., classifying the bird’s species) on your low-powered machine, but you’re making a trade-off that limits the accuracy of your model.
Still, I was able to run the model on the Pi and get some correct image classifications. Here, I’ve included some videos of the model running on my Pi, and recognizing a cardinal and a red-bellied woodpecker. For these, I trained my model using the Python notebook in my Github repo, and then downloaded the model and label files to the Pi. From there, you can just use the sample Raspberry Pi image classification code that Google provides to run the model against a video stream from your Pi camera. The code will run TFLite inferencing against a series of shots from your camera, and it will annotate the video steam with the result. You can see the label and the probability that the model assigns to the classification right on the video. Now these particular species are colorful and relatively easy to identify. When it comes to less distinctive birds like sparrows and finches, the TFLite model does noticeably worse than my fast.ai model.
Given my experience with TFLite, I’d say it will work well to give you a quick answer to a simple image classification problem, running directly on your Raspberry Pi. So I think it could be viable for, say, telling a squirrel from a bird. But if you want to accurately distinguish between many species of birds, I think you’re better off using some more computing horsepower and using a more powerful ML library, like fast.ai.
If you read my previous post on my efforts to create an accurate ML-based bird recognition model for use with photos from my bird feeder, you’ll know I was able to obtain validation accuracy of up to 95% using images mainly from the Caltech-UCSD bird image library (after cutting the number of species down to the ones I tend to see at my feeder in the Northeast US). However, my accuracy in identifying the species in actual photos from my bird feeder seemed to be much lower than 95%. I believe that’s because:
The Caltech-UCSD images may have been used to train Resnet, which is a “starter” model that I used to train my custom bird recognition model. That means that Resnet does well at identifying images from that data set because it’s already seen them before. But it doesn’t do as well when identifying images from my bird feeder because, obviously, it’s never seen them before.
Images from my bird feeder just don’t look much like the images in the data set, which I assume were taken by professional photographers, or at least by serious amateurs. Some of my photos are good (if I get lucky), but other times I get a photo of the bird’s rear end, or the bird is moving, so it’s a bit out of focus. Further, I think there’s some benefit to having images that are consistent. My bird feeder is consistently located within my photos, so you could imagine that the ML model is factoring the bird feeder out of it’s decisions regarding which label to assign to new photos, because the bird feeder didn’t play a role in any of the labels assigned to the training images.
So based on these factors, I decided to gather my own library of images and use these to train an ML model. This took a while, but as you’ll see, the accuracy I was able to achieve was quite good.
I use a Raspberry Pi 4 with a “High Quality” camera, with a Canon lens borrowed from my SLR camera, which lets me zoom in on the feeder from a spot inside my kitchen. I use PI-TIMOLO, running on the Raspberry Pi, to detect motion and capture a photo.
I’m using the Fast.ai Python library. I’ve found that it’s the quickest and easiest way to get good results. I’ve been playing around with Tensorflow-Lite to see if I can run the ML model on the “edge”, i.e., on the Raspberry Pi, not some high-powered system with a GPU in the cloud. While the results aren’t bad with Tensorflow-Lite (although not as good as with Fast.ai), it requires more fiddling with hyperparameters to get the best possible results. Fast.ai can figure out the best learning rate, for example, and can even adjust it automatically while training proceeds.
I used Google Colab to train the model. Colab gives you free access to a GPU, which is a specialized processor that cuts the training time of a model like mine by about 80%. If you use it too much, Google might cut you off and you can either wait a while until you can use a GPU again, or pay up for their Pro version, which costs $10/month.
Gathering Your Own Image Data Set
I put my Python notebook in Github, but I’m not going to include all my images. If you want to try this on your own you’ll probably want to gather your own photos. This wasn’t difficult, but it took some dedication (obsession?) in order to sort through hundreds of photos each day and put the good ones in folders, organized by bird species. Before taking on bird ML as a hobby, I didn’t know much about birds, so I had to ask for help in identifying the birds now and then (thanks, Steve!). By the end, my image data set had about 2000 images of 15 different bird species. I didn’t separate the various sparrow species–I just lumped them together–but maybe I’ll try that later.
In 9 training epochs I was able to get the results below. For the first 3 iterations, the Resnet weights are frozen, but in the last 6 epochs, these can be adjusted. The idea is that it’s not worth adjusting thousands of Resnet neural network weights early in the game, when your model is completely clueless. But once your model has been trained to a reasonable level, you can let fast.ai adjust the Resnet weights to optimize your results.
So with an error rate of 0.05, we’re at 95% validation accuracy. Given that Resnet’s never seen these pictures, we hope that this accuracy will carry over to inference on real images captured by the Rasperry Pi, day-by-day. Did it? Yes!
As an example, you can see that in my Python notebook I ran inferencing on 15 additional images, and it got only one wrong. You can see an example of this at the left. Below, you can see the model’s results. It thinks this is a cardinal with very high certainty (0.999), which is correct.