Dlexa: Hedge Dragon Using Mycroft AI

Jobs Jobs Jobs

Staff for a lexa Project

Building a lexa is very labour intensive. A lexa project has a lot in common with shooting a movie: there are lots of different jobs. It would be like creating a biography for an interesting real or fictional person.

You need a script of dialog for the character (creative writer), that is authentic to that character in a conversation (script editor), and a set design for the place where you would meet (interior decorator, carpenter, painter).

An animatronic lexa would help to fill out the character. That means a body (sculptor), with moving parts (mechanical and electrical engineer), and a connection to current events (journalist, communications engineer).

All that would give a lexa personality, but it needs to listen (STT speech-to-text: math, computer science, linguistics), search a database (math and database management) for an appropriate response (psychologist), then provide that response (audio engineer) in an appealing voice (TTS text-to-speech: math, neural net trainer, voice coach) and with appropriate animations (anatomist, psychologist).

So it takes a university plus a technical school plus a vocational school to build the talent needed to make a lexa.

Personality Script Editing and Evolution

Starting off with an interesting personality is key because we already have un-interesting lexas (like Alexa) and they need a real job (like retail sales) to avoid the off switch. The best starting point is a popular fictional character.

Interesting means character quirks. The main characters in video games would be good models: GLaDOS from Portal, Mario from the NES games, Master Chief from Halo, Ellie from The Last Of Us. A personality you would like to sit down and talk to.

Someone who has played the Portal game might have a feeling for what it would be like to sit down and have a conversation with Cave Johnson. You can get all his voice lines from the internet. But what if a visitor asked Cave whether he'd use a non-stick or seasoned cast iron pan to sear a steak?

The answer must come from things that Cave would say no matter how bizarre. He would never say "does not compute" or "I did not understand that". Doing that would instantly break the feeling you are talking to a person.

So an in-character response could be [key word: question asks for a choice between alternatives] from real Cave dialog:
"The testing area's just up ahead. The quicker you get through, the quicker you'll get your sixty bucks."

Because this filler line does not actually answer the question, it would generate an error in the log. A script writer would then invent an in-character Cave Johnson response that would be used the next time similar key words showed up in a visitor conversation:
"We tried both on our test subjects and neither was satisfactory. Our science guys figured out how to use a flame thrower."

Notice this is a perfect job for a person in love with psychology. Dlexa may be math but a lot of her depends heavily on the arts.
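The lookup-and-log loop described above can be sketched in a few lines of Python. This is only an illustration, not how Mycroft does it: the SCRIPT table, the key words chosen and the respond function are all made-up names. The two voice lines are the ones quoted above.

```python
import logging
import re

logger = logging.getLogger("lexa.script")

# Hypothetical script table: a set of key words -> an in-character line
# that a script writer added after seeing a miss in the log.
SCRIPT = {
    frozenset({"pan", "steak"}): (
        "We tried both on our test subjects and neither was satisfactory. "
        "Our science guys figured out how to use a flame thrower."
    ),
}

# A real Cave Johnson line used as filler when nothing in the script matches.
FILLER = (
    "The testing area's just up ahead. The quicker you get through, "
    "the quicker you'll get your sixty bucks."
)


def respond(utterance, script=SCRIPT, filler=FILLER):
    """Return a scripted line if all its key words appear, else the filler."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for keywords, line in script.items():
        if keywords <= words:  # every key word was spoken
            return line
    # Never say "does not compute": stay in character, but log the miss
    # so a script writer can add a proper answer later.
    logger.warning("No scripted response for: %s", utterance)
    return filler


print(respond("Would you use a non-stick or cast iron pan to sear a steak?"))
```

The important design choice is that a miss is invisible to the visitor and visible to the script writer. The log becomes the script writer's to-do list.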

Form and Appearance

Alexa is often a disembodied voice (in your car) or a smart speaker (the furry can in your living room) or a microphone (the stick you use to call up a movie on your streaming services).

A lexa needs a more creative form. The only constraint is it must be large enough to contain all the required electronics. It could be a robot (from Star Wars or WALL-E or Wheatley from Portal) or a fantasy creature (a Cheshire Cat or unicorn or, like Dlexa, a dragon) or a large plush toy (panda or bear or rabbit).

Kids' toys are probably the best models because the non-smart versions already have a natural charm. A large plush teddy bear triggers some kind of ancient wiring in the human brain that makes it huggable. A smart teddy bear with a Cave Johnson personality would have the impact you are looking for.

So you've decided on a 6-foot-tall teddy bear. Now you must decide on materials. If it is an inside installation you could probably start with fake fur and fiberfill and a rigid interior box for the electronics and mechanical parts. Someone in a robot club could build the whole thing and you just have to sew up a fluffy cover.

An outside lexa is a bigger challenge. A plush teddy bear would look very sad after snow and rain and become a home for slimes and rodents. Outside you need a hard shell with an easy-to-clean surface. This requires a sculptor with some serious metal working or plastic forming skills. The shell has to be designed first to hold the electronics and mechanical parts, then built to that design. Changing your mind after the shell is complete may mean starting over with a new shell. You probably need Disney Animatronics shop skills to make this work.

Beginners should start with an indoor lexa.

Dlexa is an alternative for an outdoor lexa: a living topiary hedge plant. This means you need years to grow the body. The advantage is that the "finished" form does not have to be perfect. The body is constantly growing so you can change your mind and have the plant grow a new body in a year. The most important factor is that no one can build an identical copy. If a thousand topiary dragons show up in your neighbourhood every one of them will be welcomed. A second plywood bear may attract a zoning violation.

Engineering Skills

Now we get to the Math part. If you want the lexa to move its head or arms there are equations that define the movement of levers and hinges. Get those wrong and the motors won't have enough force to make those movements efficient or reliable. Ohm's law determines how much power you can deliver to things like motors, speakers and lights without melting the wires. Signal transmission math determines how long a data wire can be before the data pulse is degraded and noise corrupts it. And of course the lexa mind is a huge computational math problem.

The engineering skills for the lexa body are more traditional: mechanical to get the animatronics hardware working, electrical to provide power and communications for motors, lights, speakers and computers, and computer engineering to program the microcontrollers that support sensors and actuators. For an outdoor lexa all these have to be built to endure the weather (like mechanical) or hide from the weather (like electrical).
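To give a feel for the lever and wire math, here is a small Python sketch. The formulas are the standard ones (torque = force x arm length, V = I x R, P = I^2 x R, R = resistivity x length / area). All the example numbers (a 0.5 kg head, a 10 m cable run, a 5 amp motor, 0.75 square mm wire) are assumptions picked for illustration, not measurements from Dlexa.

```python
COPPER_RESISTIVITY = 1.68e-8  # ohm-metres, copper at about 20 C
GRAVITY = 9.81                # m/s^2


def lever_torque(mass_kg, arm_m):
    """Torque (N*m) a motor must supply to hold a mass out on a level arm."""
    return mass_kg * GRAVITY * arm_m


def wire_resistance(run_m, area_mm2, resistivity=COPPER_RESISTIVITY):
    """Resistance (ohms) of a cable run, counting the wire out AND back."""
    area_m2 = area_mm2 * 1e-6
    return resistivity * (2 * run_m) / area_m2


def wire_loss(current_a, run_m, area_mm2):
    """Return (voltage drop in volts, heat in watts) for a cable run."""
    r = wire_resistance(run_m, area_mm2)
    voltage_drop = current_a * r      # Ohm's law: V = I * R
    heat_watts = current_a ** 2 * r   # power turned into heat: P = I^2 * R
    return voltage_drop, heat_watts


# Assumed example: 0.5 kg head on a 0.2 m neck, motor fed by 10 m of cable.
print(f"head torque: {lever_torque(0.5, 0.2):.3f} N*m")
drop, heat = wire_loss(current_a=5.0, run_m=10.0, area_mm2=0.75)
print(f"voltage drop: {drop:.2f} V, heat in the cable: {heat:.1f} W")
```

With these assumed numbers the cable loses a little over 2 volts and turns about 11 watts into heat, which is a noticeable bite out of a 12 volt supply. Thicker wire or a shorter run fixes that.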

The engineering skills for the AI that is the lexa mind are more like those of a research scientist. Building a lexa requires working with a lot of bits of technology that are very new and not well documented. The "leading edge" of technology is often called the "bleeding edge" because things break all the time. Like most research projects things fail 95% of the time. This can be extremely frustrating. In this area great math skills are not enough: you need patience, debugging skills and a sense of humor because Murphy's Law may be the most difficult problem to solve.

How Smart?

This is where the technology is advancing incredibly fast. Dlexa will start with Mycroft AI, which is a platform that can perform at a level that appears almost as good as Alexa. The appearance is, of course, an illusion. Alexa runs on a vast room of servers and has thousands of clever researchers advancing her capabilities and many more thousands doing the script editing function.

Dlexa's only advantage is that Alexa is built to efficiently sell stuff, while Dlexa is built to entertain. Dlexa will make mistakes in her attempts to entertain that would be the basis for lawsuits if Alexa tried them. It is the freedom of the artist over the button-down collar of the office worker. Dlexa can win her place because Alexa is not even allowed to play the "entertainment" game.

There is a new smart application emerging, ChatGPT, that is currently only available by subscription.

ChatGPT is not a replacement for Mycroft. It uses an extremely good model of a language (like English) to cobble together paragraphs of text based on a subject request. It would make a very good back-end for Mycroft "does not compute" states. If there is no Mycroft skill for a subject and Mycroft's internet search is not finding good answers the conversation could be transferred to ChatGPT as a filler.

This is less useful in most Mycroft applications. In a home automation example, if you ask Mycroft to turn on the swimming pool lights and you don't have a swimming pool, Mycroft is "stuck". If you transfer control to ChatGPT it would carry on for hours describing the kinds of pool lights, their relative cost and their reliability. The proper solution, of course, would be to add a skill to Mycroft that just says "We don't have a pool."
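The ordering described above (a specific skill first, a chatbot only as the last resort) can be sketched as a simple chain of handlers. This is plain Python and does not use the real Mycroft skill API or any real chatbot API; pool_skill, chatbot_filler and answer are hypothetical names, and the chatbot is a stand-in that makes no network call.

```python
def pool_skill(utterance):
    """Hypothetical skill: a short, honest answer for a known gap."""
    if "pool" in utterance.lower():
        return "We don't have a pool."
    return None  # not my subject, let the next handler try


def chatbot_filler(utterance):
    """Stand-in for a remote chatbot; a real one would call out to a service."""
    return f"Let me tell you what I know about: {utterance}"


def answer(utterance, handlers):
    """Try each handler in order and return the first reply that is not None."""
    for handler in handlers:
        reply = handler(utterance)
        if reply is not None:
            return reply
    return None


# Specific skills go first; the chatbot is the last resort.
HANDLERS = [pool_skill, chatbot_filler]

print(answer("Turn on the swimming pool lights", HANDLERS))
print(answer("Tell me about dragons", HANDLERS))
```

Because the pool skill sits ahead of the chatbot in the list, the pool question gets the one-line answer and never reaches the long-winded filler.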

The advantage of Mycroft is keeping your conversations private. You cannot run a local copy of ChatGPT, and it requires a vast amount of compute resources to do its magic, far more than would fit in a standard PC. The subscription model makes sense because those compute resources are only required for a tiny amount of time between question and response. This is less than a second in human conversation. Between those one second spikes of enormous demand the compute hardware has nothing to do. For a commercial venture the expensive idle time is consumed by having thousands (millions) of visitors (customers) asking for answers.

ChatGPT is a GPT-3.5 application. Currently (January 2023) there are few details on the hardware needed to support it. The best guess is it needs at least five (5) Nvidia 80GB A100 GPUs just to load one model (chatbot) for execution. These cards fit into a standard PCIe 4.0 x16 slot and cost $20,000 each. A high end gaming PC ($5,000) might have three PCIe 4.0 slots, so you are going to need a server level system (special build $20,000). Each card needs 300 watts of power, so providing cooling for 1500 watts would be a problem.

It would be much cheaper to rent A100 GPU images from Azure or AWS. As a cloud service they cost about $3 per hour for each GPU. For 8760 hours in a year the annual cost for an A100 would be about $26,000, and a five A100 image just to run a ChatGPT bot would be about $150,000 per year (including network and data storage costs). Note that this is just to run one model. As a rule of thumb training costs are about ten times execution costs, so a system you can train (including staff) would be $2 million per year.

A subscription for one "chatbot" from OpenAI is about $100 per month. As a back-end, the Dlexa project would need about five (Dlexa, Pearl, Worms, two development bots), which would cost about $6,000 per year.
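The three options above (buy, rent, subscribe) are easy to compare with a few lines of Python. The prices are the rough January 2023 guesses quoted in the text, not quotes from any vendor, and the function names are just for illustration. The rental figure here is GPU time only, which is why it comes out below the $150,000 estimate that also covers network and data storage.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def purchase_cost(gpus, dollars_per_gpu, server_dollars):
    """One-time hardware cost: the GPU cards plus a server to hold them."""
    return gpus * dollars_per_gpu + server_dollars


def rental_cost_per_year(gpus, dollars_per_gpu_hour):
    """Cloud rental for GPUs kept running all year (GPU time only)."""
    return gpus * dollars_per_gpu_hour * HOURS_PER_YEAR


def subscription_cost_per_year(bots, dollars_per_bot_month):
    """Subscription cost for a number of chatbots."""
    return bots * dollars_per_bot_month * 12


buy = purchase_cost(gpus=5, dollars_per_gpu=20_000, server_dollars=20_000)
rent = rental_cost_per_year(gpus=5, dollars_per_gpu_hour=3)
subscribe = subscription_cost_per_year(bots=5, dollars_per_bot_month=100)
power_watts = 5 * 300  # five cards at 300 watts each

print(f"buy:       ${buy:,} once, plus cooling for {power_watts} watts")
print(f"rent:      ${rent:,} per year")
print(f"subscribe: ${subscribe:,} per year")
```

On these numbers the subscription is roughly twenty times cheaper than renting the GPUs.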

Even if the Dlexa project (a kind of home automation) could be built on a future version of ChatGPT that included home automation-like abilities, it would be limited to the "vanilla flavour" offered by ChatGPT. Dlexa in the dragon hedge running on ChatGPT would be very much like the Alexa for your car or streaming services. Worse still, its activities (like Alexa's) would be visible to the three letter agencies (NSA, CIA, FBI) and may be sold to advertisers as well.

I am certain that for a "craft" lexa like Dlexa, Mycroft AI will be the best choice for the duration of Dlexa's 5 year project plan.
