Monday, December 8, 2014

Cricket part 2: Robot --> drum synth

One of my earlier posts detailed my abandoned plans for a kids' robot I called Cricket. But the brain module worked just fine and featured a bunch of pots for real-time control, so why not turn it into a drum synthesizer? It already had a sound system (amp and speaker) and a big yellow arcade button as a trigger.

Cricket the drum synth: The Cricket brain module with more pots, no motors.

The brain module was packed full of cables, controls, and LED diffusion pods, so my first goal was not to modify any circuitry. (I added pots and switches to the front panel, but they plugged into existing headers.)

Second, I didn't want it to take forever, so I kept the feature list simple:
  • Two digital oscillators with selectable waveforms (saw/triangle/noise/50% square) and independent pitch controls
  • Two-stage (attack and release) amplitude envelope
  • Global pitch LFO with controls for speed, depth, and waveform (same choices as the oscillators)
  • Osc 2 -> Osc 1 frequency modulation, with adjustable depth and a high/low pitch range switch for Osc 2
  • Selectable AND/OR/XORing of the oscillators with each other
  • Digital wrapping/clipping distortion with adjustable depth  
(Yes, there's no filter. Deal with it. :) 
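
To make the signal chain concrete, here's a rough Python sketch of one drum hit built from the features above: two 8-bit oscillators, Osc 2 -> Osc 1 FM, bitwise combining, wrap/clip distortion, and an attack/release envelope. It is not the dsPIC firmware. The sample rate, function names, and parameter scaling are all made up for illustration, and the LFO is left out.

```python
import random

SAMPLE_RATE = 8000  # Hz; an assumed value, not taken from the real firmware


def osc_sample(wave, phase):
    """Return an unsigned 8-bit sample (0..255) for a phase in [0, 1)."""
    if wave == "saw":
        return int(phase * 256) & 0xFF
    if wave == "triangle":
        ramp = phase * 2 if phase < 0.5 else 2 - phase * 2
        return min(255, int(ramp * 255))
    if wave == "square":  # 50% duty cycle
        return 255 if phase < 0.5 else 0
    if wave == "noise":
        return random.randrange(256)
    raise ValueError(f"unknown waveform: {wave}")


def combine(a, b, mode):
    """Bitwise-combine two 8-bit samples; any other mode is a plain average."""
    if mode == "and":
        return a & b
    if mode == "or":
        return a | b
    if mode == "xor":
        return a ^ b
    return (a + b) // 2


def distort(sample, gain, mode):
    """Boost around the 8-bit midpoint, then wrap around or clip to 0..255."""
    boosted = int((sample - 128) * gain) + 128
    if mode == "wrap":
        return boosted & 0xFF
    return max(0, min(255, boosted))


def envelope(n, attack, release):
    """Linear two-stage envelope: gain in 0..1 at sample index n."""
    if n < attack:
        return n / attack
    if n < attack + release:
        return 1 - (n - attack) / release
    return 0.0


def render_hit(f1, f2, wave1, wave2, mode, fm_depth,
               dist_gain, dist_mode, attack, release):
    """Render one hit as a list of unsigned 8-bit samples."""
    out = []
    p1 = p2 = 0.0
    for n in range(attack + release):
        s1 = osc_sample(wave1, p1)
        s2 = osc_sample(wave2, p2)
        shaped = distort(combine(s1, s2, mode), dist_gain, dist_mode)
        out.append(int(128 + (shaped - 128) * envelope(n, attack, release)))
        # Osc 2 nudges Osc 1's frequency (FM), then both phases advance.
        mod = (s2 - 128) / 128
        p1 = (p1 + f1 * (1 + fm_depth * mod) / SAMPLE_RATE) % 1.0
        p2 = (p2 + f2 / SAMPLE_RATE) % 1.0
    return out


hit = render_hit(110, 37, "triangle", "square", "xor", 0.5,
                 2.0, "wrap", attack=40, release=2000)
```

Swapping `"wrap"` for `"clip"` or `"xor"` for `"and"` changes the character of the hit quite a bit, which is roughly what the front-panel switches do.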

I had a switch left over, so it selects the direction the LEDs light up while Cricket is running. There's also an "in" jack that I might use in the future as a footswitch or external trigger input. (The MIDI jack isn't functional.)

I'm really pleased with the sound! It's digital and raw, but also organic and surprisingly varied. You can get kicks, snares, metallic plinks, noise bursts, bass sounds, and even vocal-like screams and yawps. I got what I wanted and then some... pretty good for 8-bit waveforms pumped out of a single pin of the dsPIC using 6-bit PWM. The video below is pure multitracked Cricket -- no processing except for a bit of autopanning:


Here's another video with a more detailed exploration of the features and sound:


Thursday, June 5, 2014

The magic of speech synthesis: linear predictive coding

Growing up in the '80s and '90s, I had a pretty decent idea how a lot of tech around me worked. Maybe I couldn't actually fix a TV with a blown tube or swap out a dead (soldered) CPU on a motherboard yet, but I knew how the big pieces fit together, what they were supposed to do, and what might happen if a given piece went kaput.

Speech synthesizers were not in that category.

When I first encountered a Speak & Spell, it seemed like magic. The voice was so crude and inhuman it was obviously computer-generated (i.e., not recorded). It was halting and seemingly stitched together from scraps of speech, but I'd never even heard of phonemes, let alone a process by which a chip like the one I found inside could spit out words and phrases.

For a long time, I had an inordinate fascination with the Speak & Spell, the General Instrument SP0256-AL2, and the speech synthesis cartridges for the TI-99/4A and TRS-80. (Wasn't there a C64 speech cartridge too?) I never did find out much about how they worked, though, or get my hands on hardware to experiment.

Linear Predictive Coding: Speech Analysis, Synthesis, Compression

Fast-forward 20 years or so to DSP class... and it turns out that most of those devices, along with a healthy amount of speech synthesis today, are based on variants of the linear predictive coding (LPC) technique. For my class project, I worked up an LPC example in Matlab to peek under the hood.

LPC models the human vocal tract as a medium-order time-varying filter (typically 10th-order) excited by pitched and unpitched (noise) impulses created by the diaphragm and vocal cords. A speech sequence (e.g., a word) is created from a train of impulses filtered with changing filter coefficients and gain.

LPC discretizes speech into overlapping frames of 10-20 ms, where the filter coefficients, gain, and impulse type and pitch are constant for a given frame.
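
Here's a rough Python sketch of the synthesis side as described above: each frame picks an excitation (impulse train or noise) and runs it through an all-pole filter. This is not my Matlab code. The frame dictionary layout and function names are invented for illustration, frames are simply butted together (no overlap handling), and the impulse train restarts at each frame boundary.

```python
import random


def excitation(n_samples, voiced, pitch_period):
    """Unit impulse train for voiced frames, uniform noise for unvoiced ones."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    return [random.uniform(-1.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(x, a, gain, state):
    """Compute y[n] = gain*x[n] - sum_k a[k]*y[n-k].

    `a` holds a[1..p]; `state` holds past outputs, newest first.
    Returns the output samples and the updated state.
    """
    y = []
    for xn in x:
        yn = gain * xn - sum(ak * yk for ak, yk in zip(a, state))
        y.append(yn)
        state = ([yn] + state)[:len(a)]
    return y, state


def synthesize(frames, order=10):
    """Run every frame through the filter, carrying filter state across frames."""
    state = [0.0] * order
    out = []
    for frame in frames:
        x = excitation(frame["length"], frame["voiced"], frame["pitch_period"])
        y, state = all_pole_filter(x, frame["a"], frame["gain"], state)
        out.extend(y)
    return out
```

Carrying the filter state from one frame to the next avoids a click at every frame boundary, even in a sketch this simple.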

LPC is most commonly used as a compression scheme: speech is analyzed to estimate frame parameters, the frame parameters are transmitted using far fewer bits than the original speech, and the parameters are applied to a filter and impulse train in the receiver to synthesize output speech.
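
For the analysis step, one common way to estimate a frame's filter coefficients is the autocorrelation method solved with the Levinson-Durbin recursion. The Python sketch below shows that approach under the sign convention x[n] ≈ -(a[1]x[n-1] + ... + a[p]x[n-p]). I'm not claiming it matches any particular implementation, and it leaves out windowing, pitch detection, and the voiced/unvoiced decision.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err): a holds a[1..order], err is the residual
    prediction-error energy (its square root is a usable frame gain).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or perfectly predictable frame; stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1 - k * k
    return a[1:], err


# A signal whose autocorrelation decays as 0.5**lag is a first-order process,
# so the second coefficient should come out as zero.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

Only the coefficients, gain, pitch, and voicing flag need to be sent per frame, which is where the compression comes from.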

The figure shows data from the whole process. From the top, there's the filtered input audio, the detected pitch period in samples for each frame, the resulting excitation signals (pulse trains in green, noise in blue) and gains, and the final synthesized output.


Basic LPC turned out to be easier and more interesting to implement than I expected... considering that I didn't write custom code for everything and that I did leave out quite a bit of work that would normally be required to tune up the sound quality, optimize computing time, and/or achieve compression specs. (Here's a great writeup on all the work that went into the Speak & Spell.)

A few samples of the output:

It's pretty cool to be able to pull speech apart, in a sense, and put it back together any way you like. I'm interested in experimenting with my code to create interesting musical textures, including vocoding by replacing the impulse train with audio from a musical instrument.

Code is here!

Friday, January 24, 2014

Lego Segway with minimal-order observer control

Self-balancing Lego robots are nothing new, but everyone uses PID controllers. I wanted to implement an observer controller to do something new and flex my controls muscles. 

I built a Mindstorms robot that uses a light sensor to measure light reflected off the floor and thereby the robot's tilt. This turned out to be finicky since I had to set the zero point manually, and ambient light variations screwed things up fairly often. It worked well enough in the end though.

Controller Design

A full-order observer controller uses a model of the system in the control loop, which allows us to observe state information that would otherwise be hidden in the actual system. We can then use that state info in the feedback to reduce the error, which now incorporates both the system and model outputs. This can be a robust way to control high-dimensional systems while also being able to inspect the (estimated) states for useful insights.
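
To show the idea in code, here's a small Python sketch of a discrete-time full-order observer controller. The plant is a toy double integrator, not the robot's model, and the gains `K` and `L` put all the poles at the origin (deadbeat) purely so the numbers are easy to check by hand. Every matrix and name here is an assumption for illustration.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# Toy plant: x[k+1] = A x[k] + B u[k], y[k] = C x[k]  (double integrator, T = 1)
A = [[1.0, 1.0], [0.0, 1.0]]
B = [0.5, 1.0]
C = [1.0, 0.0]
K = [1.0, 1.5]  # state-feedback gain: eigenvalues of A - B*K are both 0
L = [2.0, 1.0]  # observer gain: eigenvalues of A - L*C are both 0


def simulate(x0, steps):
    """Run the loop; the controller sees only y, never the true state x."""
    x = list(x0)
    x_hat = [0.0, 0.0]  # the observer starts with no knowledge of the state
    history = []
    for _ in range(steps):
        y = sum(c * xi for c, xi in zip(C, x))
        u = -sum(k * xh for k, xh in zip(K, x_hat))
        innovation = y - sum(c * xh for c, xh in zip(C, x_hat))
        # The model copy is corrected by the measured-minus-predicted output.
        x_hat = [ax + b * u + l * innovation
                 for ax, b, l in zip(mat_vec(A, x_hat), B, L)]
        x = [ax + b * u for ax, b in zip(mat_vec(A, x), B)]
        history.append((x[:], x_hat[:]))
    return history


trace = simulate([1.0, 0.5], steps=6)
```

Each entry of `trace` holds the true and estimated state after one step, so you can watch the estimate lock onto a state the controller never measures directly.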

However, we may not actually need all the state information. A minimal-order observer (aka functional observer) still uses a model, but requires fewer poles to be chosen than a full-order observer. That simplifies design and eliminates the need to compute state-space transformation matrices.

The figure shows the minimal-order observer, with the controller elements labeled as psi 0 and psi 1. In the lower diagram, psi 0 is algebraically combined with the summation block to simplify coding. As noted, each psi function is a ratio of (simple) Z-domain transfer polynomials.
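
A ratio of Z-domain polynomials turns directly into a difference equation, which is what ends up running on the robot. Here's a generic Python sketch of that step (not my RobotC code). The class name and example coefficients are made up, with coefficients given in ascending powers of z^-1.

```python
class TransferFunction:
    """Discrete transfer function H(z) = num(z^-1) / den(z^-1).

    num = [b0, b1, ...] and den = [a0, a1, ...] give the difference equation
    a0*y[n] = b0*u[n] + b1*u[n-1] + ... - a1*y[n-1] - a2*y[n-2] - ...
    """

    def __init__(self, num, den):
        if not num or not den or den[0] == 0:
            raise ValueError("need a numerator and a nonzero leading denominator term")
        # Normalize so the leading denominator coefficient is 1.
        self.num = [b / den[0] for b in num]
        self.den = [a / den[0] for a in den]
        self.u_hist = [0.0] * len(num)        # newest input first
        self.y_hist = [0.0] * (len(den) - 1)  # newest output first

    def step(self, u):
        """Feed one input sample and return one output sample."""
        self.u_hist = [u] + self.u_hist[:-1]
        y = sum(b * ui for b, ui in zip(self.num, self.u_hist))
        y -= sum(a * yi for a, yi in zip(self.den[1:], self.y_hist))
        if self.y_hist:
            self.y_hist = [y] + self.y_hist[:-1]
        return y


# Hypothetical first-order block: H(z) = 1 / (1 - 0.5 z^-1)
block = TransferFunction([1.0], [1.0, -0.5])
impulse_response = [block.step(u) for u in (1.0, 0.0, 0.0, 0.0)]
```

Calling `step` once per control cycle is the whole runtime cost: a handful of multiply-adds per block.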

Minimal-order diagram in Simulink. In the actual system, the real robot takes the place of the "Linearized Model".

I coded the observer controller in RobotC with the help of a couple of Matlab scripts to choose poles and calculate the coefficients of the transfer polynomials. I could have put more work into accurately modeling the robot (weighing it properly, etc.), but as you can see, it works well enough.

The video's a bit long, to show the balancing stability - skip to 1:30 to see me driving the robot with a joystick over Bluetooth. Driving could use some smoothing, but it's fun.

Code is here.

Sunday, June 16, 2013

Tunes are go!

Holy Roland, I can't believe it's taken me until 2013 to move my music hosting off MySpace! The only thing more embarrassing is that people have invested money in MySpace in the interim... good luck, Justin.

Anyway, I've got three albums up: Singlestar (the latest), along with collections of film music and older tracks. The site is here, but I've also embedded players below. Enjoy!

Krylenko (Bandcamp)


Composed - Music for Film

Collected 1999-2009

Wednesday, June 12, 2013

Remixing a mixer

Is there a recording musician who hasn't owned a Behringer mixer? They're cheap as chips and do what they say on the tin.

I'm surprised my current model is only the second I've owned in 15 years of mucking about with music gear. It's a tiny thing, but just about perfect for the space I have and inputs I need. That said, it didn't come with an aux send. Those are super-useful, especially with my new spring reverb, so I decided to add one.

A bit of parts diving, soldering, and gluing later and I've got a mono out, stereo return aux bus. Had to scrap the tape I/O but don't think I'll be missing it. Here's a pic of this truly classic Junkbox Raider mod:

I could have made it uglier, but I ran out of time.


Monday, January 28, 2013

I broke a what?!?!

I've busted a lot of stuff over the years - mostly the poorly constructed and therefore delicate projects I'd built, but also plenty of electronic components, hardware, circuit boards, etc. I've even broken and bent a few small tools.

But until yesterday I'd never snapped off half a pair of needlenose pliers so cleanly it looked like they'd been sawed apart. How's that even possible? (Sure, my finger strength is unparalleled, but I wield it gently. :)

I'd post a picture, but I can't be bothered to dig 'em out from under the pile of Robosapien discards clogging up the trash. Trust this random guy on the Internet, though - it really happened.

Maximum information, minimum post

I've been planning for a while to write up some research I worked on in 2011 involving intrinsic "motivation" for robots. We got a workshop paper out of it, and I presented the results to the ECE department last year. I also planned to extend it into my thesis project.

But... the lab went through some advisor round-robin and the project fell apart, and I just don't feel like writing it up into a full post anymore.

In a nutshell, our robot learned a policy for a partially observable Markov decision process (POMDP): it explored the objects in a space by manipulating them with its arm and assigning object classification probabilities, with Shannon information gain across all objects as the learning reward.
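
For a feel of what an information-gain reward looks like, here's a tiny Python sketch: each object has a probability distribution over classes, an observation updates it by Bayes' rule, and the reward is the total drop in Shannon entropy. This is one plausible reading of the setup, not the code or the exact formulation from the paper, and the function names and numbers are invented.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def bayes_update(prior, likelihood):
    """Posterior over classes given P(observation | class) for each class."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def information_gain_reward(priors, posteriors):
    """Total entropy reduction across all objects, in bits."""
    return sum(entropy(before) - entropy(after)
               for before, after in zip(priors, posteriors))


# Two objects, two classes each; the robot pokes only the first object.
priors = [[0.5, 0.5], [0.5, 0.5]]
posteriors = [bayes_update(priors[0], [0.9, 0.2]), priors[1]]
reward = information_gain_reward(priors, posteriors)
```

An action that tells the robot nothing new leaves the distributions unchanged and earns zero reward, which is what pushes the policy toward informative manipulations.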

Here's the AAAI workshop abstract, with a link to the full PDF:

Here's a fun picture of the robot!