Dlexa: Hedge Dragon Using Mycroft AI

Collection of build logs and notes from Linux to Mycroft

Builds Links

Each of the Dlexa components is a separate build project with timelines and troubles.

Hardware Hacking:

As you read through the construction of the dragon you will find it involves a large variety of tools and components. If you were to buy all these at "retail" the cost would run into many thousands of dollars. If you were doing this as a commercial project you would complete the design then contract out the parts you need. This "design", "provision", "assemble" sequence yields a predictable process that allows you to cost out and manage the project timeline. You can also predict the time and cost to manufacture 1,000 units.

The "hacker way" starts with a goal and a junk box. As the construction proceeds there are many designs most of which fail because the junk box does not have the parts you need. This is a process of invention where you turn "what you have" into "what you need". It is more like what happens in a research lab where an upside down coffee cup is exactly the right height to aim the laser.

As a hardware hacker Dean has collected a "warehouse" of bits and pieces that can be cobbled into dragon parts. The "new" electronic parts are mostly very old designs that come from eBay (refurbished or recycled) or AliExpress (obsolete technology). If the need is for an A/D converter you can buy an MCP3202 dual 12-bit 100,000 SPS (samples per second) part for $2 (free shipping) from China or a 24-bit 10,000,000 SPS part from DigiKey for $100. The hacker changes the design so the obsolete 12-bit part works.
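To get a feel for what "changing the design" around the 12-bit part means, here is a rough sketch of the step size an ideal converter gives at each resolution. The 3.3 volt reference is an assumption for illustration only; it is not a figure from the build notes.

```python
def lsb_volts(bits: int, vref: float) -> float:
    """Smallest voltage step an ideal ADC with `bits` of resolution can resolve."""
    return vref / (2 ** bits)


VREF = 3.3  # assumed reference voltage, not taken from the build notes

for bits in (12, 24):
    step_uv = lsb_volts(bits, VREF) * 1e6
    print(f"{bits}-bit: {2 ** bits} steps, {step_uv:.3f} microvolts per step")
```

If the signal being measured only needs to be known to within a millivolt or so, the $2 part is already good enough and the design can be built around it.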

In a commercial manufacturing operation the hardware hacker gets fired. In a research lab full of graduate students the hardware hacker is king.

Hedge:

"Dlexa" is a homage to Amazon's Alexa. The dragon is 30 feet long and 5 feet wide. The hedge garden from 2021 to 2023:

Pictures: Sept 2021; May 2022 Wire Frame; Jan 2023 with Parts; June 2023 with DML.


2021: Two years after Escallonia was chosen as the hedge plant
In 2019 cuttings were "obtained" and started.
In 2020 an 18" channel was cut into the gravel, lined with plastic, and filled with soil
By 2021 the plants were about a foot high (with 2' spikes)

2022: A frame was set up to lay out the dragon shape.
The plants added over a foot in just two months' growth

2023 January: Shows the location of the various dragon parts.
On the far right of the picture (in pink) there are three Escallonia plants that make up the tripod for the Pearl. If you wire an Escallonia to a bamboo pole and cut back any side growth it will be forced to grow into a tree trunk. An Escallonia can be forced into a "standard" or lollipop shape by doing this. For Dlexa we need the trunks to grow up about 4 feet, widen to form a plate that is about a foot wide, then continue to grow the walls of a cage that holds the Pearl.

The pink ball labeled "Pearl" shows about the size and position of the Pearl lexa, which is a clear sphere with smart LEDs, a microphone and a camera.

Next to the tripod (in green) are several branches from the central line of plants that are forced to intersect the tripod at the base of the Pearl. These will be trimmed to form the dragon's right arm. Four claws (formed from landscape pipe) will extend from the right paw: two above and two below to grasp the Pearl.

On the left of the picture there are two light blue loops. These form the right back leg and foot. Two Escallonia plants were started just outside the dragon's belly and forced to grow at 45 degrees toward the tail of the beast. These will be trimmed into a leg and the plants trained to send out runners to form a bushy pad. The pad will be trimmed into the shape of a foot and four claws (three forward and one back) finish the model.

2023 June: Installed DML Frame for the Dragon's Voice.
The hedge was cut for the DML/WiFi frame that supports the dragon's head.
Details of the electronics in the DML frame are in the Hedge Details.

There is a grey-pink square with rounded corners hung inside the frame. This is the DML speaker. Testing showed that two 3 watt DML exciters driven by a PAM8403 Class-D amp created a 16 foot square space of constant sound starting at the edge of the sidewalk. A standard speaker is a "point source" that creates a cone of declining sound volume in front of the speaker. DMLs are an "area source" and create a rectangular patch of almost constant sound volume. The effect is quite remarkable.
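For comparison, the "declining sound volume" of a point source can be put into numbers. This is a minimal sketch of the textbook inverse square law for an ideal point source in open air, not a model of the DML panel and not a measurement from the hedge. The distances are made up for illustration.

```python
import math


def point_source_drop_db(d_near: float, d_far: float) -> float:
    """Level drop in dB when moving from d_near to d_far from an ideal point source."""
    if d_near <= 0 or d_far <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(d_far / d_near)


# Hypothetical listening distances in feet, measured from the speaker
for distance in (2, 4, 8, 16):
    drop = point_source_drop_db(1, distance)
    print(f"{distance:>2} ft: {drop:.1f} dB below the level at 1 ft")
```

Each doubling of distance costs an ideal point source about 6 dB, which is why a patch of nearly constant volume 16 feet across is so noticeable.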

The DML speaker is protected with a black bug screen to keep out the wind, weather and stray escallonia branches.

A shelf below the DML has a small block with 4 power lines, each providing 24 volts at 2 amps (48 watts per line, about 200 watts in total) to operate the dragon.
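The power budget can be checked with a few lines of arithmetic. This sketch takes the shelf figures as four lines at 24 volts and 2 amps each. The loads in the example are hypothetical placeholders, not parts from the build.

```python
def line_watts(volts: float, amps: float) -> float:
    """Power available on one supply line."""
    return volts * amps


def total_watts(lines: int, volts: float, amps: float) -> float:
    """Power available across all supply lines."""
    return lines * line_watts(volts, amps)


def line_has_headroom(load_watts: float, volts: float = 24.0, amps: float = 2.0) -> bool:
    """True if a single load fits on one line without exceeding its rating."""
    return load_watts <= line_watts(volts, amps)


print("Per line:", line_watts(24.0, 2.0), "watts")
print("All four lines:", total_watts(4, 24.0, 2.0), "watts")

# Hypothetical loads, for illustration only
for name, watts in {"body lighting": 40.0, "motion effects": 60.0}.items():
    verdict = "fits on one line" if line_has_headroom(watts) else "needs to be split"
    print(f"{name} ({watts} W): {verdict}")
```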

On top of the DML is an inverted clear bowl. This protects the SE567 WiFi router that supplies a strong signal for all the microcontrollers that provide sensors and cameras as well as lighting, sound and motion effects.

2023 Plans:
With the Escallonia adding about 18" in June and another 18" in August, some of the body shape can be trimmed into a dragon before Fall 2023. The tripod will be strong enough to hold the Pearl. Power and electronics in the hub on the DML frame shelf can now be extended to add lighting effects in the body and Pearl.

Linux vs Windows for a Service Computer:

So why use Linux and not Microsoft Windows? The main problem with Windows is constant updates that require shutdown and restart.

Very little of the Linux code is in the inner rings of the operating system. Other than hardware changes it is rare for a Linux system update to require a restart. A Windows system needs restarts about once a week.

Windows insists that "up to date" means your OS is identical to every other desktop on the planet. Many applications need parts of their code deep into the inner rings of the OS to perform well. This means an update to a function you will never use will require a restart.

The best way to manage a large service application (like home automation) is to have "function specific" computers. These are typically "headless", that is, without a screen, keyboard or mouse. Once they are configured they can run for years without needing a restart.

It is bad enough that Windows systems constantly go out of service for unpredictable software updates. Worse still with a networked application the restart of one node may put the entire chain of functions out of service. It is now rare but there are still some Windows updates that have the dreaded "Push any key to continue". For a headless system that means dragging the computer case out of its shelf and finding a screen, keyboard and mouse to "approve" any problem a partial update has found.

Often where banks of Windows systems are in a service application each computer is connected to a KVM (Keyboard, Video, Mouse) cable. The individual KVM cables are connected to a switch supporting a single keyboard, screen and mouse that can select any system in a cabinet of computer racks. To solve the random restart problem many companies just restart all Windows systems at midnight. A more dangerous choice is to run unsupported versions so they are never updated (Windows XP in ATMs). None of these is a good solution for our headless 7-by-24 lexa systems.

Ban Windows? Absolutely not. About half of the 20 computers in my house are Windows. Most are managed with RDC (remote desktop) and cursed every time Microsoft forces one of those "push any key to continue" events.

For an office environment where you have a real keyboard and mouse nothing works better than Windows. For that reason Windows is used as the cornerstone system for documentation (like this website) and backup as well as microcontroller development and remote control for the many headless Linux systems.

Linux, Hardware and Package Choice:

The four core systems: Mycroft, Sherlock, Watson and Dragon are loaded with Ubuntu 20.04.

Mycroft is large enough to do training. It is $$$ (water cooled) and has an Intel i9-9900K with 8 cores that can run 16 threads. It has 16GB of memory and a 2TB SSD (solid state disk). The graphics card is an RTX 2080 with 12GB of Vram and 4352 CUDA cores. 8 GB of Vram is required for training.

The Mycroft system has STT, TTS, and Mycroft AI installed and has been tested with an apk loaded on a cell phone. The question was:
"What version of Mycroft am I running" and it responded:
"I am running mycroft-core version 21 oh 2, release 2"
"You are on the latest version."
This could only come from the software running on the Mycroft system.

Sherlock is a test to find the cheapest GPU deployment system. It is a refurbished Dell T1600 system with an Intel i3 2120 with 2 cpus and 2 threads. It has 8GB of memory and a 128GB SSD (solid state disk). The graphics card is a GTX780 Ti with 3GB Vram and 2304 CUDA cores.

1 GB of Vram is required to run a CNN application so Sherlock should be able to run an STT, a TTS and Mycroft. The refurbished system cost $250 and the GTX780 Ti (3GB, CUDA compute capability 3.5) cost $150 so if it works a high school could deploy a lexa AI for about $400. A new GTX1050 costs $250 (4GB, CUDA compute capability 6.1), giving a system cost of $500 (better bargain: new GPU on old CPU).

A note on package choice. Coqui STT/TTS is reported to be the best of the open source (free) applications. It was based on Mozilla STT/TTS but tests faster and is easier to train. An attempt to use the pre-built Coqui STT failed on the GTX780 Ti because Coqui uses the (newer) PyTorch CNN to CUDA interface. PyTorch was released in January 2018 and requires CUDA compute capability 6.0 or higher so it will not run on GPUs released before 2016. The GTX780 Ti was released in 2013 and has compute capability 3.5 so it will never be compatible.

Mozilla STT/TTS uses the older (more widely used) TensorFlow (by Google) interface (released 2015) that runs on CUDA compute capability 3.5 or later so it will run on the Sherlock GPU. See CUDA Versions for Nvidia GPUs.

TensorFlow can be recompiled to run on CUDA compute capability 3.0 so it can support all Kepler (10 year old) GPUs [GeForce 660 or later, GeForce 730 or later, Quadro Kxxx].

TensorFlow will also run in a "noGPU" mode but the smallest supported GPU will run 6 times faster than CPU only hardware. The noGPU version runs as long as the CPU has AVX in its instruction set (CPUs from 2011 onward). This means STT/TTS could be run on very cheap hardware so it is worth trying (maybe for Worms).

There are instructions for recompiling TensorFlow to use the ancient SSE (1999) precursor to AVX but the 20 year old hardware will have slower bus and memory speeds so the result is hardly worth the effort.
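The package choice described above boils down to a small decision table. The sketch below only encodes the thresholds quoted in these notes, reading the CUDA numbers as GPU compute capability. The function name and the option labels are made up for illustration and do not come from any library.

```python
from typing import Optional, Tuple


def pick_stt_backend(compute_capability: Optional[Tuple[int, int]], has_avx: bool) -> str:
    """Return the option these notes suggest for a given GPU and CPU.

    compute_capability is (major, minor), or None when there is no Nvidia GPU.
    """
    if compute_capability is not None:
        if compute_capability >= (6, 0):
            return "Coqui (PyTorch) on GPU"
        if compute_capability >= (3, 5):
            return "Mozilla (TensorFlow) on GPU"
        if compute_capability >= (3, 0):
            return "Mozilla (TensorFlow recompiled for 3.0) on GPU"
    if has_avx:
        return "Mozilla (TensorFlow) on CPU with AVX"
    return "TensorFlow recompiled for SSE (hardly worth the effort)"


print(pick_stt_backend((3, 5), True))   # a GTX780 Ti class card
print(pick_stt_backend((6, 1), True))   # a GTX1050 class card
print(pick_stt_backend(None, True))     # no GPU, CPU with AVX
print(pick_stt_backend(None, False))    # no GPU, no AVX
```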

Watson is a test to find the cheapest ($150) non-GPU deployment system that supports AVX. It is a refurbished Dell 3020M system with an Intel i5-4590T with 4 cores, 16GB of memory and a 128GB SSD. This will test TensorFlow without a GPU. If TensorFlow finds no GPU it defaults to AVX instructions. For some applications the noGPU operation (TTS taking several seconds to encode a short sentence) may be fast enough (speaking news or a story).
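Before trying the noGPU route on a junk box machine it is worth checking whether the CPU reports AVX at all. A minimal sketch, assuming an x86 Linux system where /proc/cpuinfo lists the CPU features on a "flags" line (other platforms name this differently). The helper names are made up for this example.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the CPU feature flags listed in /proc/cpuinfo style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_avx(cpuinfo_text: str) -> bool:
    """True if the flags include the exact token 'avx'."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX supported:", has_avx(f.read()))
    except FileNotFoundError:
        print("No /proc/cpuinfo on this system")
```

Matching whole tokens matters here: a CPU that lists only "avx2" or "avx512f" style names would also list plain "avx", but a substring search could be fooled by an unrelated flag.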

Dragon is an ancient Dell 755 (1999 vintage with no AVX) with an Intel Core 2 single core CPU that can run two threads. It has 4GB of memory and a 250GB (rotating) disk (bought 10 years ago for $100). Mycroft AI should run on it as long as newer faster servers do the STT and TTS parts that need CNN.

You can in fact let Google do the STT and TTS parts but that means all the conversations are being recorded by Google (and potentially all the three letter agencies: NSA, CIA, FBI). Or you can network to a large system in someone's house. For example the Mycroft computer in Dean's house (with some careful router changes to expose STT and TTS ports) could support an entire high school lexa club.
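After the router changes it helps to confirm from outside that the STT and TTS ports really are reachable. This is a minimal sketch using only the Python standard library. The host address and port numbers are placeholders, not the values used on the Mycroft computer.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    HOST = "127.0.0.1"  # placeholder: replace with the server's address
    SERVICES = {"STT": 8080, "TTS": 5002}  # hypothetical port numbers
    for name, port in SERVICES.items():
        state = "reachable" if port_is_open(HOST, port) else "not reachable"
        print(f"{name} on {HOST}:{port} is {state}")
```

A successful connection only shows that the port is exposed. It says nothing about whether the service behind it is the right one or is answering requests correctly.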

In any case if the old Dell 755 runs Mycroft AI then any old pile of junk found at the side of the road (if it boots) will run Mycroft AI. Might even try a TensorFlow SSE experiment on it.

Dragon is running the Apache HTTP Server and hosts this dlexa.ca website (so not completely useless).

STT:

Installing the Speech to Text component.

TTS:

Installing the Text to Speech component.

Mycroft:

Installing the Mycroft AI glue that connects STT to TTS.

Voice Selection:

Choosing voices for Pearl, Dlexa, and Worms.

Mycroft Skills:

Giving the three hedge creatures their personalities.

Next:

If you are here for just the build walkthroughs - that's great.

It means you are already building your own lexa. If you have some time, use the tabs to explore the history and background for the Dlexa project.