More LLM/AI Tooling
Much like the 90s, it is a wonderful time to be a software developer, tech person or coder, except that instead of the web, the current awesome thing (and it is awesome) is LLMs/AI. Along with them come tons of new things to learn if you want to host your own LLM, make demos for your new startup idea, or just tinker around and build a sentient AI (a pet project of mine). Much of what I’ll talk about will age poorly, so think of this as a mid-2024 snapshot of some choice AI tooling.
I recently wrote about HuggingFace and Langchain; they are not required reading, as both will also be covered here, but those posts go more in depth if you want to check them out before or after.
Get yourself a cutting-edge LLM AI and run it in 2 minutes flat!
The easiest/fastest/no-stress way to run an AI, and also a great introduction to the tooling, is via Google’s Colab (hosted notebooks), Unsloth (LLM acceleration/tooling) and Meta’s Llama 3.1-8B (the AI/LLM itself). How easy?
# See here for the GNU licence
# https://colab.research.google.com/drive/1T-YBVfnphoVc8E2E854qF3jdia2Ll2W2
# But basically 3 steps: clone the Unsloth studio repo, read its chat script, run it
!git clone https://github.com/unslothai/studio > /dev/null 2>&1
with open("studio/unsloth_studio/chat.py", "r") as chat_module:
    code = chat_module.read()
exec(code)
Granted, we are cloning a repo (and you should always check the code you run), but after waiting a couple of minutes you get a working chat interface:
And you can keep chatting and developing on the cloud for as long as you have Colab credits (the free tier should also work, but I pay about $10 a month, which is plenty for experimenting and tinkering).
Backend & Frontend (sort of)
The above demo is interesting because it shows one way of dividing responsibilities amongst different tools, the alternative being coding everything from scratch, which you probably don’t want or need. It goes more or less like this:
The model weights are hosted on HuggingFace. Unsloth does a few things like importing the model and setting up basic chat templates; more importantly, it also provides fast inference, fine-tuning, saving and other dev tools (the performance improvements alone are worth using it for). Last but not least, Gradio provides the UI. The stack looks roughly like: HuggingFace (model weights) → Unsloth (loading, acceleration, fine-tuning) → Gradio (chat UI).
I recommend you look at the Unsloth repo, which should give you a starting point for expanding and experimenting with your project, for things like custom chat templates and fine-tuning. You might not need it, but under the hood there’s PyTorch and a few other ML libraries for advanced functionality. Again, the focus here is on creating demos; you could also browse Gradio’s examples if you need other types of AIs.
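To make that division of labor concrete, here is a minimal sketch of the same stack wired up by hand, assuming the unsloth and gradio packages are installed; the model id, sequence length and generation settings are illustrative, not taken from the demo above.

# Hypothetical sketch: HuggingFace hosts the weights, Unsloth loads and speeds them up,
# Gradio provides the chat UI. Model id and settings are illustrative.
import gradio as gr
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # 4-bit weights pulled from HuggingFace
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path

def chat(message, history):
    # History is ignored here for brevity; a real chat would fold it into the prompt
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": message}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

gr.ChatInterface(chat).launch()  # the same kind of UI the studio script builds for you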
Going Native... You could of course disregard external tooling altogether and use whatever tools your favorite AI provider offers: simpler and sometimes faster, at the cost of being dependent on their tools and stack. If that sounds good, both Google and Meta have you covered.
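As an example of going native, here is a minimal sketch using Google’s google-generativeai Python SDK; the model name, prompt and API-key handling are illustrative assumptions, not something from the paragraph above.

# Hypothetical sketch of the provider-native route with Google's SDK.
# pip install google-generativeai; the API key comes from Google AI Studio.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # assumed to be set in your environment
model = genai.GenerativeModel("gemini-1.5-flash")       # illustrative model choice

response = model.generate_content("Summarize why hosted notebooks are handy for LLM demos.")
print(response.text)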
Locally?
Who doesn’t like running things from the comfort of one’s own computer? The thing is that LLMs are currently expensive to run hardware-wise: I’d say the bare minimum is 16GB of RAM, disk space is more or less cheap, but you’d also need a GPU (or several) and a speedy processor or two for things to run smoothly, so $$$$. Once you have the hardware you have a couple of options:
- Ollama is a pain-free way to run some choice models locally and experiment with things like tools, chat templates and fine-tuning.
- HuggingFace Transformers on the other hand is the de facto LLM library these days, but comes with a steep learning curve (a minimal sketch follows this list).
- Torchchat: just as I was publishing this, PyTorch released a library that runs LLMs locally, in the browser (using Streamlit instead of Gradio) and on other platforms, worth checking out.
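For reference, here is a minimal, hedged sketch of the Transformers route; the model id is illustrative (it is a gated repo, and an 8B model needs serious RAM/VRAM), and a smaller model works with the exact same code.

# Minimal sketch of running an LLM locally with HuggingFace Transformers.
# Model id is illustrative; pick something smaller if your hardware is modest.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # gated repo: needs HuggingFace access approval
    device_map="auto",  # spread across GPU(s) if available, otherwise CPU; needs accelerate installed
)

print(pipe("The three ingredients of a quick LLM demo are", max_new_tokens=60)[0]["generated_text"])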
The API life chose me
There’s a clear divide (at least in my mind) between folks who want to tinker, research and experiment with new tech, and those who want to make money using the latest technology. A third type, I suppose, would be those who want to make money without writing a single line of code and/or while pretending to use the latest technology; money usually follows in reverse order, but I digress. Using an API makes perfect sense if you want a demo for some commercial venture fast, at the cost of, well, money.
- OpenAI + Langchain should cover almost everything your LLM startup idea needs; you will still need an execution environment and a frontend, which should be covered by legacy web/app/desktop development tools (things like Flask, React, Node, etc., etc.). A short sketch of this route follows below.
There are increasingly more AI/LLM open source tools and models, so why should you pay? Well, you shouldn’t if you find something open source to your liking, but paid services and tools start to make sense if you want some feature that is not available, need some business reassurance like predictable upfront costs and support, or, well, you are short on time and this is the fastest way to ship your project.
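To make the API route concrete, here is a minimal sketch using LangChain’s OpenAI integration; the model name and prompt are illustrative, and it assumes the langchain-openai package is installed and OPENAI_API_KEY is set.

# Hypothetical sketch of the OpenAI + LangChain route.
# pip install langchain-openai; expects OPENAI_API_KEY in the environment.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)  # illustrative model choice

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a terse assistant for a startup demo."),
    ("user", "{question}"),
])

chain = prompt | llm  # LangChain's pipe syntax chains prompt -> model
print(chain.invoke({"question": "What should our landing page headline say?"}).content)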
Small Servings
So you got your LLM/AI thing working and ready for the world and prospective investors/customers to check it out; how do you serve it?
- HuggingFace Spaces is the easy and low-cost/free way (it still has a steep learning curve), perfectly fine for demos and experiments; you can even embed your Space on your site! And make private Spaces for those closed-door investor meetings. The only obvious downside is that you tie yourself to HuggingFace; performance, terms of service, features, models, etc., etc. could and probably will change, not always for the better.
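For a sense of how little a Space needs, here is a minimal, hypothetical app.py for a Gradio Space; the bot logic is a placeholder you would swap for a real model call, pushed to the Space repo together with a requirements.txt.

# Hypothetical app.py for a HuggingFace Space using the Gradio SDK.
# The Space builds and serves whatever launch() exposes; swap the placeholder for your model.
import gradio as gr

def demo_bot(message, history):
    # Placeholder logic; in a real Space this would call your model or an API
    return f"(demo) You said: {message}"

gr.ChatInterface(demo_bot, title="My LLM demo").launch()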
Big scoops
Another way to serve AIs is through big tech offerings. You might think using them is a waste of time for your “small” project or experiment, but you pick up a new skill along the way that might pay the bills in case your startup fails; they also each have some cool feature or characteristic, and it might be easier to scale, so do shop around. Here are two that are developer friendly, with a short sketch after them:
- AWS AI doesn’t need an introduction, but it is shaping up to do for AI/LLMs what AWS did for hosting and app development; I struggle with every interaction with their site/docs, but can’t fault the actual tools.
- Azure AI services: same as AWS, all kinds of AI/LLM infrastructure running in the cloud for a price, usually with free developer accounts/credits on offer, so it’s painless to check out.
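As a taste of the AWS side, here is a minimal sketch against Amazon Bedrock’s Converse API via boto3; the region and model id are illustrative assumptions, and it presumes your AWS credentials are configured and the model is enabled for your account.

# Hypothetical sketch of calling a hosted model through Amazon Bedrock (boto3).
# Assumes AWS credentials are configured and the chosen model is enabled in the account.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # illustrative region

response = client.converse(
    modelId="meta.llama3-1-8b-instruct-v1:0",  # illustrative model id
    messages=[{"role": "user", "content": [{"text": "Pitch me a moving-day AI assistant in one sentence."}]}],
)
print(response["output"]["message"]["content"][0]["text"])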
Did I miss something?
This LLM/AI tooling list is short for a couple of reasons: firstly, things are just getting started on the tooling front, and chances are these will be the incumbents (plus a few more unknown stragglers); and secondly, I am moving apartments and my world is cardboard and packing tape right now. So if I missed a tool or you have a favorite I didn’t mention, do leave a comment; you will be helping a tired AI dev/researcher who can’t remember which box the bed screws are in. Maybe I (or you) should make a moving AI assistant, and some of the tools discussed here should come in handy.
Thanks for reading!