Saturday, December 29, 2007

The Countershaft

The lathe's countershaft is an improvised replacement: some manner of hard steel rod. It was perfectly suitable except that it was 5" too long and protruded from the pulley that connects to the motor (hereafter known as the drive pulley). To make matters worse, the last 6" of the rod was threaded. I decided to address this unsightly safety issue.

The smooth part of the shaft wasn't long enough, which left the drive pulley sitting on threads. This didn't seem sensible to me. I determined the shaft's proper length and cut off the excess. That was a good start, but it left the threads under the drive pulley. I removed them by turning the last inch or so of the shaft down from 3/4" to 1/2". Then I made a sleeve to bring the diameter back up to 3/4" by boring a 1/2" hole in an aluminum rod. I pressed this onto the shaft; it was a tight, one-way fit. Finally, I turned the sleeve to the proper diameter, faced the end of the shaft, and broke the corners.
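As an aside, the sizing arithmetic for a fit like this is simple. A common shop rule of thumb (my figure, not from the post) calls for roughly 0.001" of interference per inch of shaft diameter for a press fit, so a 1/2" shaft wants a bore about half a thou undersize. A quick Ruby sketch:

```ruby
# Rough press-fit sizing. The ~0.001" per inch of diameter figure is a
# common shop rule of thumb, not a measured value from this repair -
# consult a fit table (e.g. FN2) before cutting metal.
def press_fit_interference(shaft_dia_in)
  shaft_dia_in * 0.001
end

def bore_target(shaft_dia_in)
  shaft_dia_in - press_fit_interference(shaft_dia_in)
end

shaft = 0.500   # the turned-down end of the countershaft
puts format('bore the sleeve to ~%.4f"', bore_target(shaft))
```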

This was the first press fit of this size that Marco and I had ever done. It could be likened to monkeys doing math.

If I'd had the steel, I would have made the sleeve from that. We'll see how long the aluminum lasts.

Here are pictures of the countershaft in place.







While I had it all apart, I decided I didn't like the retaining collar on the right end of the countershaft and the pulley on the left end rubbing against the casting. It didn't seem to make a lot of sense. As an experiment, I cut some washers from a plastic jug. We'll see how long they last. I don't expect they'll have too much pressure on them.

Saturday, December 8, 2007

Repairing the bull gear and cone pulley

Probably due to the previous 'fix' to the lathe, there is a considerable amount of wear on the hubs of the bull gear and cone pulley. As far as I can tell, there was no bearing between these parts. They were pressed hard together by the force of the spindle and frequently spun at different speeds. The Zamac hubs rubbed and eroded themselves down. Eventually, the outer edges of these parts started rubbing together and caused scoring. A bronze bushing in the cone pulley stopped the damage to that part, but the bull gear kept getting chewed up, if at a slower rate.

I decided to make room for a bearing between the bull gear and cone pulley. I'd size it so the edges ended up close but wouldn't touch.

The bull gear's right hub has seen some mysterious wear. I don't know what caused it, but it was unsightly, so I removed it. This also prepares for a possible retainer bearing similar to the one I put by the change gear. In addition, there are 60 holes at the edge of the bull gear for indexing. As is all too common, the retaining pin got pushed in while the lathe was running and tore up the holes. I smoothed the hub and the indexing holes.



I turned the bull gear over and took about 90 mils off the hub area. Note the damage to the edge of the piece - this isn't machined. That's where it wore against the cone pulley.



I turned my attention to the cone pulley. Its hub was not nearly as damaged, thanks to the protection offered by the bronze bushing. I ran over it with Marco's mill. Again, note the damage toward the edge. That's where it rubbed against the bull gear.



This picture shows the bull gear with the sintered bronze 'oilite' bearing in place. The bearing is proud of the surface to prevent wear around the edges of the parts.



Here it is all together. The cone pulley and bull gear are protected from one another by the bearing. Since the bearing is thicker than the amount of material I removed, the edges of the parts no longer touch, as seen in the following picture.



I need to file a few flats in my spindle for set screws. But otherwise this completes the headstock repairs.

Sunday, December 2, 2007

More lathe stuff

I ordered parts from www.mcmaster.com (McMaster-Carr). They have a superb web site. You pretty much describe what you're looking for, and the site presents you with a list of matching parts and additional selection options. Before long you're down to a part or two and you can select the one that suits you. It's painless.

One thing I discovered about my spindle recently is that it's not like the others I've seen on the web. The bearing shoulder on mine is a full 1/2" shorter. That, and the lack of a flange bearing, is why the headstock had that weird thrust path.

Here's the spindle. The part we're concerned with is the shoulder in the middle of the image. I have to get the force from that shoulder to the thrust bearing.



I ordered a remarkably thin needle thrust bearing and a sintered bronze flange bearing. My friend used his Sherline to part the flange bearing to length. In this picture you can see how the bronze bearing channels the thrust from the spindle shoulder to the thrust bearing.



The back gear isn't really secured now that I've removed that huge bearing. Originally I was going to secure the gear to the cone pulley, but I've nixed that. Instead, I turned a UHMW retaining ring that consumes the empty space between the back gear and the thrust bearing. Here's the ring. Note how it rides on the flange bearing. The ring fits relatively loosely and, being UHMW, is very slick. If all goes according to plan, the back gear will not be able to affect the bearing.



Here it is all together. The back gear has nowhere to go.



Now there is still a little work to do on the spindle. The mating hubs of the bull gear and cone pulley are abrading one another - this is a very high-wear part of the design. Perhaps a bearing between them was lost over time. The hubs have now rubbed down enough that the outer edges of the parts touch and scar. I'm going to turn the hubs smooth, then remove enough material from each so a 1/8" washer bearing fits between them. I'll leave the bearing a little proud so it prevents the outer edges from touching. This should be easy enough.
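To put numbers on it: the bearing stands proud by its thickness minus the total material removed from the two hubs. The 1/8" thickness is from the plan above; the cut depths below are made-up illustrative values.

```ruby
# Hub-relief arithmetic for the washer bearing. BEARING_THICKNESS is
# from the post; the example cut depths are hypothetical.
BEARING_THICKNESS = 0.125   # 1/8" washer bearing

def proud_height(bull_gear_cut, cone_pulley_cut)
  BEARING_THICKNESS - (bull_gear_cut + cone_pulley_cut)
end

# Taking 0.055" off each hub would leave the bearing 0.015" proud,
# keeping the scarred outer edges apart.
puts format('bearing stands %.3f" proud', proud_height(0.055, 0.055))
```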

Stay Tuned!

Friday, November 16, 2007

The headstock

I've been enjoying tinkering with the lathe, which was the point after all. I put the new back gears on, no problems there to speak of.

I pulled the spindle off. Due to heinous hackery, the spindle's bull and back gears didn't line up with the mating gears on the back gear shaft. It was an adventure getting it apart. The biggest problems involved set screws - a previous owner had cranked them down so tight that they left divots in the spindle. The divots prevented the close-fitting gears from coming off the spindle.

As I examined the spindle, I discovered something very wrong with it. The 'path of thrust' through the spindle was plain crazy. When you put something in a lathe, you normally squeeze it between the headstock and tailstock, or you manipulate the cutters in such a way as to push on the headstock. This can transmit a lot of force to the thrust bearing in the headstock. With this lathe, the path of force was:

The nose
The spindle
The bull gear's set screw (!)
The bull gear
The cone pulley
The back gear
The ball thrust bearing

That is not correct at all! None of the gears on the spindle should be in the path of thrust! One thing I noted was that the spindle is a 1.25" OD rod. The right 80% of the spindle has a steel sleeve that brings its diameter up to 1.5". Why is this? One reason is that the step formed at the left end of the sleeve is a dandy place to put a bronze flange bearing. This transfers the force to a ball thrust bearing that sits against the left babbitt bearing.

So with this scheme, the path of force is:
The nose
The spindle
The bronze flange bearing
The ball thrust bearing

Much better!

As I was planning this out, I noted that the back gear isn't really attached to the pulley. There's nothing to keep the back gear from sliding around. Previously, the thrust bearing kept the gear captured, which led to the silliness described above. The back gear rides on a brass bushing that protrudes from the pulley. I'll pull the bushing out a little, perhaps 3/8", mill a groove very near the edge, and use a spring clip to capture the gear.

Sunday, November 4, 2007

Lathe Stuffs

I've been having an exciting adventure with my lathe. As described previously, it's a 'project' lathe. That is, it is a project simply to get it running! One silver lining is that I can't really hurt it. So I've been tearing it down to see how it works. First, there are a lot more individual parts than I thought. You don't see them until you really pay attention. Second, I know how the lathe works now. Previously it was abstract "this turns this" sort of general idea. Now I know precisely how it works, and I know what the parts are called.

While my lathe was made in 1939 or so, the Clausing corporation still has some parts available! Amazing. They were kind enough to send me a parts list; see the Atlas/Craftsman 12" engine lathe parts diagram 101.07383. I can use the part numbers to find suitable replacements on eBay.

Now, for the dirt.

I took the tailstock completely apart. Something very bad happened to it. I suspect whatever it was destroyed the hand-wheel and bent the ram screw. What's left is a well-intentioned bastardization. The worst part is that the culprit didn't have a left-handed Acme thread lying around (who does!) to make another ram screw. So he bored out the ram's left-handed Acme thread and re-cut a common right-handed thread. This means the tailstock works exactly backwards from every other tailstock on the planet. Fortunately, fixing this is relatively easy - I'll bore THOSE threads out and press in the appropriate nut. Now, the machinist did a commendable job of turning a large bolt to make a new ram screw - it isn't an easy thing to make. It sure looked like a lot of work - far more work than spending a few bucks on a left-handed threaded rod or a few more dollars on a new ram screw.

I took the backgear assembly apart too. I was not pleased - again, something rendered the backgears unusable - they wouldn't stay engaged. I saw another full assembly on eBay and bid on it as a lark. To my surprise, my low-ball bid won. Comparing the new assembly with the original, I can see my backgear shaft has been replaced, and badly. Instead of turning a replacement shaft, the person found a section of pipe... Now, the backgear shaft is indeed hollow and contains another shaft which is on an eccentric. This eccentric is used with a lever to engage the back gears when appropriate. The fit of the eccentric shaft within the section of pipe was poor, to say the least, which is probably why the back gears would not stay engaged. To add insult to injury, the pipe had a larger diameter than the original shaft, so the guy bored the gears to fit. Now they aren't even useful as backup parts. Terrible decisions.

I'm not comfortable running the lathe without all the safety equipment. I scored an Atlas spindle pulley guard on eBay to replace mine, which had been lost sometime in the last 70 years. Atlas made the Craftsman lathes, so the parts are generally interchangeable. It's a tight fit - the bull gear actually brushed up against it. In addition, the new large back gear is wider than the old one and crashes against a large washer that sits beside the mating spindle gear. So I have two problems to address - but it looks like, with some creative reordering of washers on the spindle, I'll be able to make this all fit.

So far I've spent less than $35 on eBay to get critical parts. I still have to buy the left-handed tailstock screw and nut but those shouldn't be too much.

The original purchase price was quite reasonable; I'll have this running for about half the price of a well-cared-for lathe.

Monday, October 22, 2007

Ms Stormcrow makes a pleasing quilt

Ms Stormcrow has nearly limitless energy. Here's her latest quilt. Each block is made of various dark fabrics and several scrappy light-valued concentric circles. The circles are machine appliquéd and left rough-edged so they fluff up in the wash.

Sunday, October 21, 2007

Eagle Ceremony

My son earned his Boy Scout Eagle rank some time ago, and we had his ceremony recently.

Here's a before-ceremony shot. Why the pensive look? Because Thomas and Trey are rough-housing with gusto, and he's probably wondering if he's going to get an elbow in the face.



Here's the after-ceremony shot, sporting a new neckerchief. The time does fly by, indeed.



Here's a pic of Justin and his troop participating in Justin's Eagle project. Justin and his troop designed and built a privacy fence around an exposed portion of Raptor Rehab of KY. This pic says it all - all the boys are working together.

Caveat Emptor

I've been examining my lathe. Fundamentally, it's sound. Practically, however - Houston, we have a problem :-D

This is probably a common situation when you don't really know what you're buying, you can't really inspect what you're buying (eBay), your funds are limited so you can't just go buy a showroom piece, and the item must be within a reasonable drive.

The issue is that as far as I can tell, it's been dropped. There are too many cosmetic issues to chalk it up to happenstance or age. Here's the score of defects:

1. The tail stock has a non-original handle, a chip out of the casting, and works oddly.

2. A bracket holding the lead screw has been welded.

3. The handle on the cross slide has been replaced with a faucet handle (remarkably, it's very comfortable to use.)

4. The pot-metal carriage traversal mechanism is broken. The seller did give me replacement parts, however. I don't know if they fit!

Those issues are cosmetic, really. I probably need to rebuild the tail stock and see what's going on inside there.

The other issues are

5. The motor's bearings are shot. This is a 1 1/2 hp Baldor. If I can replace the bearings I'll have a prize. This motor isn't original and is about 4 times more powerful than the original motor. I'll probably put a spare 1/2 hp motor on it and proclaim victory. And sell the Baldor.

6. The countershaft isn't original and the cone pulley was slipping. I could make it work pretty easily. However, this is a key part of the lathe, and I can't use the lathe to make this part. I need to make this right since the rest of the lathe depends upon it.

7. The back gear shaft is also not original. Chances are, there's some damage to one of its mounting bracket castings. I'll remake the shaft; that should be easy. The problem is the back gears won't stay engaged. I get conflicting reasons why from the web. I expect I'll be making a new part that will keep the arm engaged.

8. All the bearings need to be inspected. The babbitt bearings look good, thank goodness.

9. I need to replace the lost-to-time gear cover as a safety issue.

So the lathe is indeed a project. But not everything needs to be done at once. Once the countershaft is repaired and I strap on a new motor, I can run the lathe.

This lathe, an Atlas/Craftsman 101.07383, is pretty old, and as it turns out there isn't much information on the web about it. Clausing was kind enough to send me a 1941-ish parts list, however, and it is posted here with permission.

Saturday, October 13, 2007

My new lathe

I suppose the surest sign of getting old is buying a lathe. Especially a metal lathe. My lovely wife said, "And exactly what are you going to make with that? Napkin rings? Candlestick holders?" I was like, "I will probably use the lathe to make accessories for the lathe."

She just rolled her eyes and walked off.

So, I acquired a 1939 Craftsman 12" lathe, model 101.07383. It isn't a very popular lathe, but it will do nicely for a while. The previous owner, Jim, was a remarkable man and a professional machinist. In his spare time he built and flew an aircraft. Around town, he was the guy you went to when you needed something fixed - he could tell you how to fix it or make the part you needed. Jim passed away about 2 years ago.

I'm not really sure what happened to this lathe, but it is showing signs of being used and maybe abused. Most of the more ornate pieces are gone, replaced by more mundane parts. For example, a wheel on the apron has been replaced by an old water faucet handle! Perhaps this is what a working man's 70 year old lathe looks like.

My list of projects is:

1) Replace the motor, or repair its bearings.
2) Replace the countershaft.
3) Find out why the back gear shaft won't stay put.
4) Repair the carriage's traversal gear assembly.

So once these are done, the lathe should be operational if not perfect.

Fortunately, I have access to a spare motor. #1 is handled for now.



Here's a picture. I'll get better ones up when the new basement shop isn't dungeon-dark.

I got the countershaft off tonight. It isn't original and needs to be replaced. This should be cheap on eBay or easy to make; it isn't much more than a steel rod with two flats on it. I will have to learn how to replace bushings, however.

Wednesday, August 22, 2007

Some sort of pepper sauce

My son grew a lot of jalapeños. We didn't know what to do with them all, so in a fit of desperation I made some type of pepper sauce.

I read that roasting peppers improves their flavor. I certainly wasn't going to roast them in the house. And I don't have a grill. But I do have this little George Foreman grill thing. It's great for sandwiches and so forth. I figured I had nothing to lose so I took it outside and got it warming.

I washed the peppers, sliced the stem parts off, sliced them down the middle, then pulled out and discarded the seeds and the other stuff in there.

The Geo Foreman grill is yer basic clamshell grill - there's a cooking surface on the top and the bottom. I put the pepper halves face to face in the grill so the outside would be exposed directly to the heat and the insides wouldn't be. I let them roast for ten minutes - when the skins start turning black you're getting close. In retrospect fifteen would have been better. I took the roasted peppers inside and started peeling the skin off. The skin comes off easily if the peppers are well-roasted.

I grilled and peeled a few batches of peppers. Then I pureed them with a handful of sesame seeds and enough balsamic and white vinegar, in equal portions, to make it blend well. And maybe 1/4 teaspoon of salt. Not too much. Puree like the wind!

I put the resulting mixture into a pot, brought it to a decent boil, then turned it down to simmer for about 15 minutes.

That's it.

I think it's going to go very well on chicken or pork.

Tuesday, August 21, 2007

So busy

School has started and the new programming job is ramping up fast. I haven't had much time to do anything interesting lately, much less write about it.

The wife and I did can 21 more quarts of tomatoes the other day. We're going to have good soup for the next 2 years!

The Ruby-on-Rails framework/language I ranted about previously is turning out to be pretty effective. So far it lets me do what I need to do, and easily, as long as I stay within some pretty reasonable boundaries. For example, it won't let me join two tables if there's no relationship between them. I think I can live with that.

Monday, August 13, 2007

Ruby on Rails, part 2 or 3

I've been working on an email delivery system. Today I added some safeguards that prevent it from mailing the same content to a subscriber twice in one day. In addition to keeping the boss from screaming, this lets me recover elegantly from delivery errors. It was easy enough to do. Then I added an admin facility that shows the status of the delivery run. Again, easy enough. To do this I made a new method in the controller - a one-liner to fetch the data - then looped over the data in the view, generating an HTML table as I went.
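The duplicate-mail guard boils down to remembering which subscriber/content pairs have already gone out on a given day. Here's a minimal plain-Ruby sketch of the idea - the real version lived in the app's database, and all the names here are invented:

```ruby
require 'set'
require 'date'

# Sketch of the "don't mail the same content twice in one day" guard.
# The production version would persist this in a database table.
class DeliveryLog
  def initialize
    @sent = Set.new
  end

  # Returns true (and records the send) only the first time a given
  # subscriber/content pair is seen on a given day.
  def deliverable?(subscriber_id, content_id, date = Date.today)
    key = [subscriber_id, content_id, date]
    return false if @sent.include?(key)
    @sent.add(key)
    true
  end
end

log = DeliveryLog.new
log.deliverable?(42, 'daily-digest')   # => true, first send today
log.deliverable?(42, 'daily-digest')   # => false, duplicate suppressed
```

A failed delivery run can simply be restarted: anything already sent is skipped, which is the elegant recovery mentioned above.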

Saturday, August 11, 2007

My wife's and son's garden has produced

We had tomato overload. Fortunately the wife knows how to can. We also have a plethora of jalapeños and banana peppers. I have no idea what we're going to do with them all.



Friday, August 10, 2007

More Rails

I've been heads-down learning Ruby on Rails at the new job. So far, so good. I'm working on an email delivery system using PowerMTA as the back-end. I didn't have too many problems writing the daily delivery selection code. I did have to stray from the framework a little to deliver using different templates - sometimes these sorts of deviations make frameworks really collapse. Fortunately, not this time. I also wrote code to track opens, which was very easy.
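For what it's worth, open tracking is usually done by embedding a 1x1 image whose URL identifies the delivery; when the recipient's mail client fetches the image, the open gets logged. A sketch of the idea - the host, token scheme, and method name here are all invented for illustration:

```ruby
require 'digest'

# Build a tracking-pixel tag for an outgoing email. The token keeps
# people from trivially forging open events for other delivery IDs.
# Hypothetical example only; not the actual system's code.
def tracking_pixel_tag(delivery_id, secret = 'example-secret')
  token = Digest::SHA1.hexdigest("#{delivery_id}:#{secret}")[0, 12]
  %(<img src="http://mail.example.com/open/#{delivery_id}/#{token}.gif" width="1" height="1" alt="" />)
end

puts tracking_pixel_tag(12345)
```

The server side is then just a controller action that records the hit and returns a transparent GIF.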

I celebrated by getting all my code checked into SVN and leaving the office today (Friday) at 5:01.

Thursday, August 2, 2007

Friday, July 27, 2007

Ruby on Rails

Well, the new job is a web startup that uses Ruby and the Ruby on Rails web development framework. So far so good. I generated a cheesy little app earlier this week.

Ruby is a scripting language that's sort of like a well-designed, object-oriented Perl. Rails is a framework that stresses convention over configuration. This is a stark contrast to the current Java and .NET frameworks out there. Configuring a Struts application, for example, is a pain in the butt, yet Rails accomplishes the same task with almost no configuration. The reason is that with Rails you're expected to lose your ego and follow a few pretty reasonable conventions. For example, all table names are expected to be plural, such as "customers," and all tables are expected to have an auto-increment column called "id" as the primary key. When you follow the conventions, Rails generates workable web pages and Ruby code that create, read, update, and delete rows in your database. The code produced isn't just hard-coded glop, however - it's reasonable Ruby code with plenty of hooks for the inevitable real-world modifications.
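A toy illustration of the table-naming convention (Rails' real inflector is much smarter about irregular plurals; this sketch just snake-cases the class name and appends an "s"):

```ruby
# Naive version of the Rails convention: a model class name maps to a
# plural, snake_case table name. Real Rails handles irregular plurals
# ("Person" -> "people"); this toy does not.
def table_name_for(class_name)
  snake = class_name.gsub(/([a-z])([A-Z])/, '\1_\2').downcase
  "#{snake}s"
end

table_name_for('Customer')   # => "customers"
table_name_for('LineItem')   # => "line_items"
```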

Most web pages aren't too involved. They really come down to getting data from a database and displaying it, and updating data that is already there. At my previous employer, we used an ancient VB3 application to make changes to reference tables (states in the country, transaction codes, etc.). While these rows didn't change often, when they did it was a pain in the butt. The application was flaky and didn't update all the reference tables. Due to auditing rules, we had to jump through hoops and otherwise waste a lot of time to use SQL in production. And there was no time to rewrite applications that were for internal use only! The end result was a permanent time-wasting application that used obsolete technology, was buggy, and was a partial solution. Rails would have generated the bulk of the application in about 5 minutes.

Now, there are some complex web pages out there. I have not yet done anything really complex in Rails. So the jury is out regarding how much of Rails is just hype. I recall the hype surrounding Visual Basic. The demo was always a rolodex, and it went together quick. (insert oooos and ahhhhhs from the PHBs) Then, after the purchasing VP and his VP pals went out to a victory lunch, the programmers would figure out that when you got off the beaten path, the application framework collapsed and had to be entirely coded by hand anyway.

Here's some particularly egregious Rails hype:

[Rails] was super productive and it was super fun, too! A community of contributors grew up around it to make it even more productive and more fun! Rails has been growing fast. It has capabilities that you have to see to believe!!!


And in the same page, there's a little play, where the savvy programmer shows off in front of the boss:

Boss: Hey, CB ... you say it [Rails] lets you produce code like ten times faster than the tools we're using now?
CB: Yep. It's called Ruby on Rails, and it's at least 10x.


The web is full of statements like this. Maybe they are true. But such too-good-to-be-true comments make me very wary. I've been around the roundhouse a few times!

Here's the article. Actually, if you want to learn a little about Rails, it's really good, cheerleading aside.

Sunday, July 22, 2007

Another 10 mile hike

My scout troop got in another 10 mile hike on Saturday. We went to Bernheim forest, just south of Louisville. The weather was perfect. This is the last hike we need to complete one boy's hiking merit badge. This also is his last requirement for his Eagle badge. His Eagle ceremony will be combined with my son's, as well as something for the scoutmaster, who has really gone above-and-beyond.

Here's a link to a topo.

Thursday, July 19, 2007

Detox

I'm between jobs now. Actually, my next job starts Monday, but I'm taking some time off before it starts. Just detoxing from the previous job and doing things around the house. The wife isn't lashing me too bad :-)

My son and I worked on the porch some...



Ms. Stormcrow makes lots of quilts and wanted to display one in the kitchen. I designed a quilt rack to her specs. A decade ago, I was going to make an Arts-and-Crafts-style bed, but it never happened, so I had plenty of white oak sitting in the shop taking up space. Much to my dismay, the oak was all still rough cut, so I selected some planks and hand-planed a straight edge on each with a Stanley #8C. Then I cut the lumber to width on the table saw and planed it to thickness using a planer.

I mention the planer since I love the freedom it gives me. I don't have to accept the standard 3/4" thickness. I left the brackets at 7/8" so there would be plenty of wood for the hanging hardware, thinned the top from 15/16" to 3/4" to save weight, and made the hanger bar at something like 9/16" so it would fit through the brackets easier.

Other than the wood being gnarly - the grain changed directions several times on the edge I was trying to bullnose - everything went smoothly. I finished it with some typical Minwax stain and a coat of water-based poly.

Today I hung it in the kitchen using some really excellent drywall hangers.

She's happy. When momma's happy, everyone's happy.

Saturday, July 7, 2007

The AK-47 turns 60

The favorite weapon of freedom's enemies turns 60 today. It's quite a big to-do in Russia; Vladimir Putin is making a show of it. Why not - it's about the only thing they ever built that's worth a damn.

Here's an article about it.

Two haikus:

No more retreating
Dreams of farm machinery
Instead, Kalashnikov

(I respectfully beg an extra syllable due to the long Russian name.)

Noble intention
Unintended consequence
Adolph laughs in Hell

Tuesday, July 3, 2007

Haiku

Empty cubicle
Forgotten papers rustling
Someone took the chair

Saturday, June 30, 2007

Voting with Your Feet

I'm a computer programmer by trade. One of the great "innovations" in the management of computer programmers and their craft is the outsourcing of work to offshore "resources." That means (Asian) Indians.

This was discredited as a general purpose long-term strategy about 10 years ago. The reason is that the company loses the nuts-and-bolts knowledge of how their software operates and thus becomes dependent on the consultants. They lose the technical ability to check their estimates and designs. They also lose the drive to improve their technology. Technology changes fast. Most innovations are just hype. Some aren't. Managers don't know which is which. The consultants have no incentive to rock the boat. The result is slow turn-around on projects (the company is locked in), poorly vetted designs (the company has no technical eyes), and stagnation (no innovation.)

My current employer, however, never got the memo.

This destruction doesn't happen immediately. It sneaks up on management because the consultants are soooo nice and take care of soooo many troublesome details. The people displaced by the consultants move on and thus the trap is sprung. As the company starts bleeding employees, project deadlines become at risk so they hire more consultants to pick up the slack. It spirals. And which employees leave? The ones who are the go-getters and want to build and accomplish. And which type of employee remains?

To make matters worse, our company used consultants as a substitute for planning. That is, a VP somewhere demanded that our software be internationalized so we could tap the entire world market. Now, is this rocket science? DUH TAP THE JAPANESE, KOREAN, AND EMERGING CHINESE MARKET. For the last three years, at least, that should have been a constant goal as we maintained our software - I18n software works in the USA, too! But we sat on our thumbs instead, always doing what was easiest and had the lowest risk. The VP set the completion date of The Blessed I18N Event with little technical input. It was arbitrary and artificial. We don't even have any clients who need the new software. And the ONLY way to make the artificial deadline was to go to India.

The decision to outsource my job was made six months (or more) before they bothered to tell me. Why upset the little people! So instead of spending my time documenting and preparing for it (I am a professional...) I got "Oh, by the way, the project you've been working on for six months is canceled, these Indians are going to do it instead."

Lovely, eh.

The executive who made the decision to outsource has been whacked, and the guy that followed him has also been whacked. They operated from a different and somewhat competing office. I suspect their motives weren't so pure. Good riddance.

When I met the customer-facing Indians I was pleasantly surprised by their competence, diligence, and humor. In short, I tried to dislike them but couldn't! They are really good, pleasant people. I like them all. My discomfort is not their fault.

But the fact remains that my job responsibilities are nominal and my team is destroyed. I don't code anymore and the new architect has made it clear I am not to make any decisions - that's his job. >:-|

So, three days ago, I voted with my feet and resigned. It sucks but I didn't see good odds of any improvement within the next few years. The architect is younger than me and more athletic, so I can't even out-live him. :-D

I'd been seeking a good job for about 9 months. I didn't see a reason to go from one unsatisfying job to another.

But mutual need happened at just the right time, and I landed on my feet at a web startup being run by some pals. Pretty sweet. Perhaps a nominal pay cut but I'll be challenged, have more job satisfaction, and some camaraderie. I need these things.

I'll be doing a lot of my work using Ruby and Ruby on Rails. Learning a new technology will be nice. In the meantime, I suspect they'll want some stuff done in Java too.

The new company is called The First 30 Days and is about life changes. Like changing jobs, for example.

Sunday, June 24, 2007

Vapor and Lathes

Computer programmers don't construct anything you can touch, see, show off, or really measure. More often than not, the software we write is discarded before it's ever used. Really. We create magnetic patterns on a spinning aluminum plate. Truly vaporware.

So I am drawn to machine tools. Metal is permanent, tactile, heavy, shiny, sharp, smooth, hot, and cold. You can give it to someone. You can create something useful from a lump of something that isn't. Machining is anti-programming.

So now I'm searching for a metal lathe. Not something huge because I ultimately have to move it. But something just right, like an Atlas 618 perhaps...



(link)

So I'm looking at eBay, trying to find a suitable lathe that's affordable, in good shape, and within a reasonable drive. Slim pickin's.

Wednesday, June 13, 2007

Enigma

I've always been fascinated by codes and ciphers. Lately, I've a renewed interest in the Enigma. Here's a virtual museum. I've exchanged emails with the fellow who runs it. It's a labor of love, and it's clear he really enjoys it.

Check this Enigma-like machine out. The builder is a remarkable woman, her site is here.

She's probably smarter than you, deal with it ;-)

Monday, June 4, 2007

Atlas Shrugged

My friend Jeff badgered me until I read this book by Ayn Rand. It is, overall, tremendous.

The premise is that in an alternate, circa-1960 USA, the world is a mess. Europe and South America are all "Peoples' States". The US is sliding that way as the industrialists who invent, produce, and do are plundered more and more by government looters and moochers. People are taught that rational thought is pointless - only needs (especially others' needs) matter. Man has stopped thinking, allowing others to define his values and decide his actions. The looters' premise is that each person is required to work to the fullest of his ability and be rewarded by measure of his need. The skilled machinist may have to work 18 hours a day - to his ability - so he can feed a lazy man's 12 kids. Those kids _need_ that food. As the new governing body, a central planning board, ascends in power, its looting and evil accelerate. In addition, for everyone who buys into the central planning board's 'thought is useless' philosophy, there's a near-total abdication of personal accountability: there is no incentive to take any risks, because people are rewarded by their need, not their accomplishments.

Suddenly, Industrialists - the movers and shakers, the employers, the inventors - begin disappearing without a trace. As they disappear, without suitable replacement from the labor pool of moochers and looters, society starts to fail.

The book focuses on three industrialists - one olde-blood Spanish elite whose family made a fortune in copper mining, who forgets his legacy and falls into debauchery, a self-made iron magnate who creates an incredible alloy, and a woman who runs the oldest and best railroad in the country.

Read the book to find out what happens when these industrialists' self-sufficient attitudes collide with the new philosophy of self-sacrifice for one's neighbors and their needs.

This book is about 1,100 pages, and sometimes it's very dense reading. The truth of the matter is that it could probably be edited down to 800 pages. Or maybe I am too unsophisticated to understand that the parts that seem repetitious to me are in fact truly different. A most remarkable aspect of this book is that, while it was written in 1957, it reads like it was written last week.

Highly recommended.

Saturday, June 2, 2007

Eagle Scout

My son earned his Eagle Scout award Thursday. He's a fine young man.

Saturday, May 26, 2007

Ruby Fun

I've been looking into Ruby lately. For the halibut, I decided to create a rotor-based encryption program in the spirit of Enigma. Easy peasy, once I quit thinking I knew how things worked and started modeling the actual machine.

Here's a snippet that shows rotor #3.


class Rotor3 < BaseRotor
  def initialize()
    super
    @rot =
      [
        [22,2,0,14,25,21,24,3,17,4,19,8,11,18,10,5,9,12,6,15,16,20,23,13,1,7],
        [2,24,1,7,9,15,18,25,11,16,14,12,17,23,3,19,20,8,13,10,21,5,0,22,6,4]
      ]
    @notches = [3,5,16,19,22]
  end
end


The @rot array-of-arrays shows how a pin on one side of the rotor is wired to a pin on the other side. The @notches array is analogous to a "notch" in the rotor where a pawl would be able to engage a ratchet, spinning the rotor adjacent to this rotor. In this example, when rotor three rotates from position 3 to position 4, the next rotor should also be rotated.


EJEXCBTZHHSSPGEESJEWVNJKRGKDMDUSLPIIFUUBDDYZLAOZKK
OFAUCVITGJMNFPJFNRFZZXUPUQUSZHMZPFKPBLWURBTYCHTNOR
DVSQEGXCJIHZCHRIEQOHJLLSQJFMOKQMTRGWMFIEWQHTKCAFIV
QUALZTNCRLTMFRJROXTKQVIPPIRTVHZIRIQRSBIBSBOZFRSLKV
YEIHSVHYSJHFPAKEZMTEGURWJOKGZNBVDFZKKKHIRRMHYVUEFC
XHMEPXFABFCGCZCMYABPWTVZXBVGEDRNEKUYGXCN
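For context, a rotor along the lines of the Rotor3 snippet above can be modeled like this. This is only a minimal sketch, not my actual BaseRotor; SimpleRotor and its method names are invented for illustration.

```ruby
# A minimal rotor sketch (SimpleRotor is a made-up name, not the real
# BaseRotor): a signal enters at a pin offset by the rotor's current
# position and exits at the wired pin on the other side.
class SimpleRotor
  attr_reader :position

  def initialize(wiring, notches)
    @wiring = wiring      # forward wiring table, like @rot[0] above
    @notches = notches    # positions where the next rotor should step
    @position = 0
  end

  # Map pin c through the wiring, accounting for the rotor's rotation.
  def forward(c)
    (@wiring[(c + @position) % 26] - @position) % 26
  end

  # Advance one position; return true when stepping past a notch,
  # meaning the adjacent rotor should also be rotated.
  def step
    at_notch = @notches.include?(@position)
    @position = (@position + 1) % 26
    at_notch
  end
end
```

With a notch at position 3, stepping the rotor from 3 to 4 reports true, which is exactly the "pawl engages the ratchet" behavior described above.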

Saturday, May 19, 2007

SPAMalot!

I was fortunate enough to win tickets to see SPAMalot, the stage adaptation of "Monty Python and the Holy Grail."

Parts of it were so funny it made my head hurt.

No spoilers, just go see it if you can. It is as irreverent as the original but in different ways.

Thursday, May 17, 2007

Ruby on Rails

Taking a break from the TAS project - I need to rethink my code generation strategy as it pertains to the operator precedence parser.

I installed Ruby on Rails last night. The install itself was easy. I found this excellent tutorial and created a st00pid recipe book application. Then I modified it a little, not really knowing Ruby. I monkey-see-monkey-do'd part of my changes and guessed at the rest. This was possible due to the 'convention over configuration' mindset of the Rails designers.

Rails is very much a code generator. With all such 'rapid development' tools, the first app they show you is a Rolodex. How my eyes roll. With many such tools, when you have to write a real application, the work-level goes up and suddenly you're doing all the same work anyway.

I find the "no XML config" philosophy behind Rails to be refreshing and in total contrast to Hibernate. I'm getting pretty tired of XML. I'll fiddle around with RoR for a while and see what it can really do.

Now, if I only needed a database-enabled web app...

Friday, May 11, 2007

TAS Must Die, Chapter 24

Again, not a lot has happened. I need to hit the lottery and become a professional hobbyist.

Most of my work lately has been improving the operator precedence parser (OPP) as I implement more of Iscript's command set. Not surprisingly, most of the mojo is in the expressions, not the rest of the syntax.

One thing I don't like about OPPs is that they provide no mechanism for robust syntax checking. Or at least the 'student' version I started with doesn't. The problem is that I implemented a stack for operators and a stack for IDs. Easy to use, but it allows the following:

( 3 2 + )

When the ) is scanned, it will cause the + to be reduced. It works but isn't syntactically correct. In reality, '2' never should have been able to follow '3'. This could probably be solved by using a single stack for IDs and operators at some loss of elegance. You'd have to implement a "can this follow that" sort of routine. The stack itself would handle matching parentheses. The action table would handle "is this token valid at all" checking.
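The "can this follow that" routine could be sketched like this. This is a hedged illustration, not code from my parser: classify each token and reject illegal adjacencies, which catches "( 3 2 + )" while leaving parenthesis matching to the stack as described above.

```ruby
# Sketch of a "can this follow that" adjacency check for an infix
# expression. Token classes and the FOLLOWS table are invented for
# illustration; parenthesis *matching* is still the stack's job.
OPERAND  = :operand
OPERATOR = :operator
OPEN     = :open
CLOSE    = :close

def token_class(tok)
  case tok
  when '(' then OPEN
  when ')' then CLOSE
  when /\A\w+\z/ then OPERAND   # numbers and identifiers
  else OPERATOR                 # +, *, etc.
  end
end

# Which token classes may legally follow each class (nil = start).
FOLLOWS = {
  nil      => [OPERAND, OPEN],
  OPERAND  => [OPERATOR, CLOSE],
  OPERATOR => [OPERAND, OPEN],
  OPEN     => [OPERAND, OPEN],
  CLOSE    => [OPERATOR, CLOSE]
}

def well_formed?(tokens)
  prev = nil
  tokens.each do |tok|
    cls = token_class(tok)
    return false unless FOLLOWS[prev].include?(cls)
    prev = cls
  end
  [OPERAND, CLOSE].include?(prev)   # must end on an operand or ')'
end
```

Here "( 3 2 + )" fails as soon as '2' tries to follow '3', before any reduction happens.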

The other thing that I believe will come back to haunt me is that I push Tokens onto the ID stack. This isn't wrong in itself, but a Token is what comes back from a lexer. By the time I'm parsing it, I suspect I need something more high-powered than a Token, which isn't a lot more than a name and a general data type. This became evident when I started dealing with associative arrays. Since all I have to work with in a Token is a textual name, I end up representing v1[v2] as a variable called "v1@v2". This will cause me to have to dismantle that name in the Executable, which is ridiculous. I can use "v1@v2" as the variable name, but under the hood I should keep track of the discrete parts (v1, v2, and their relationship) so I don't have to re-parse them later. Basically, instead of pushing Tokens onto the ID stack, I should probably be pushing Pcodes! I'll have to investigate this.

Tuesday, May 8, 2007

TAS Must Die, Chapter 23

In order to address the issue with "[...]" having three different meanings, I implemented a TokenStream decorator that determines the context of the "[" token and changes it to something with more meaning. No problems doing it except for a constant stream of wetware failures.

I then modified the operator precedence parser to ingest the new "this open bracket signals an array" meaning. Works fine.

I'll add the other two [-tokens tonight perhaps. The big-O(n^2) operator precedence parser configuration is starting to require a lot of typing. I suppose I have "made my bed" at this point; perhaps if there is a rev 2 I'll slim it down.

Friday, May 4, 2007

TAS Must Die, Chapter 22

Work is sucking the life out of me. Not much progress on this project.

I shifted gears a bit and installed JBoss. Then I wrote a totally crappy servlet that compiles a test iscript file, executes it, and displays the result in a browser. In short, I now have a stem-to-stern proof of concept.

I took a gander at some of our other iscript. I need to implement while loops and such. No big deal. The gotcha right now is part of the expression parsing.

In iscript, you can have some very similar looking stuff actually meaning very different things:

<!-- #set name=fred value="zoot" -->
<!-- #set name=bob value=x[fred] -->

Yes, iscript can't support a subroutine but by God it supports associative arrays. /rolleyes
This snippet would assign the value of x["zoot"] to bob.

Now, check this.

<!-- #set name=zoot value="zappa" -->
<!-- #set name=fred value="zoot" -->
<!-- #set name=bob value=[fred] -->

This assigns "zappa" to bob. Swell, huh? The syntax on that last line means, "get the value of fred ("zoot" in this case), treat it like a variable name, and use its value (zappa)." So the presence of a variable to the left of the open bracket drastically changes the meaning of the construct. Nice.

This gets tricky for my operator precedence parser. The operator stack is of no help if I need to know if the previous token read was a variable. I may be reduced to checking the token stream and replacing open brackets that follow variables with a different, synthetic token. That is, if the token stream is

variable [ variable2 ]

then what the parser is given is

variable ndx variable2 ]

where ndx is an operator that gives the parser enough information to reduce correctly. Now '[ variable2 ]' looks nothing like 'ndx variable2 ]'
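The token-stream rewrite could be sketched like this. A hedged illustration, not my actual decorator: when an open bracket immediately follows something that looks like a variable, emit a synthetic 'ndx' operator in its place.

```ruby
# Sketch of the bracket rewrite: replace '[' with a synthetic 'ndx'
# operator when it directly follows a variable, so the parser can tell
# array indexing apart from the other bracket meanings. The variable
# regex is a stand-in for whatever the real lexer considers a variable.
def rewrite_brackets(tokens)
  out = []
  tokens.each do |tok|
    if tok == '[' && !out.empty? && out.last =~ /\A[a-z]\w*\z/
      out << 'ndx'          # variable[...  becomes  variable ndx ...
    else
      out << tok
    end
  end
  out
end
```

A bare '[fred]' is left alone, since no variable precedes the bracket; only 'variable [' gets the synthetic operator.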

Another option, which is especially nice if iscript doesn't support anything fancier than 'variable [' for associative arrays, is to tokenize variable as an indexVariable. Then it would probably be possible to reduce [ variable2 ] to an index if the topmost item on the variable stack is an indexVariable.

It's also possible to have indexes on the "name" portion of SET

<!-- #set name=x[bob] value=x[eddie] -->

Where if bob is 7, x[7] will receive a new value. The brackets function differently yet again, however, in this case:

<!-- #set name=[x+7] value=3 -->

If x had the value of "pie" a new variable called 'pie7' would be created. Without the brackets the statement fails. I need to look into this more, however.

Also note that the expressions are pretty free-form. I am pretty sure this is valid:

<!-- #set name=[[x]+[y]] value=4 -->

so: append the value of the value of y to the value of the value of x, create a new variable with that name, and assign 4 to it.

I think this is likely the most difficult part of the syntax.

On deck:

1) Solve the bracket issues
2) Implement about a zillion test cases that show said work is reasonably correct
3) Implement the "include" executable. This is fundamental to any real test of our iscript. This will probably also make me reimplement my servlet in a more reasonable way.

Several hours of work there, easy.

Tuesday, May 1, 2007

TAS Must Die, Chapter 21

Work's been a killer lately. I haven't done much on my project. I decided, since I was pretty tired, to concentrate on getting all my junit tests to work. The biggest problem I had was that I was playing fast and loose with white space. Basically when passing HTML from the input stream to the output stream, I was losing all leading and trailing spaces. This is ok until we get some iscript embedded variables in there:

(*v1*) hello

is not the same as

(*v1*)hello

It was actually pretty tricky to get this to work properly. The parser assumes correctly that all whitespace in the iscript and embedded variable sections is removed as we read from the InputStream. Otherwise we'd have to load the parser up with explicit code to devour the spaces that appear between tokens. I had to use a modal flag so I could tell when to ignore whitespace and when not to. Most unfortunate but a more practical answer doesn't present itself.

If I were using a parser generator, inserting code to devour whitespace would be easy. Adding such code to a hand-coded parser would make the code hard to read.
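The modal flag might look something like this. A crude character-level sketch with invented names; the real code works against an InputStream and also has to handle the (*...*) embedded-variable sections.

```ruby
# Sketch of modal whitespace handling: outside iscript sections every
# character (including leading/trailing spaces) passes through; inside
# a <!-- ... --> section, whitespace is devoured so the parser never
# sees it. filter_chars and the mode detection are illustrative only.
def filter_chars(input)
  out = []
  in_iscript = false
  input.each_char do |c|
    next if in_iscript && c =~ /\s/   # devour whitespace inside iscript
    out << c
    in_iscript = true  if out.last(4).join == '<!--'
    in_iscript = false if out.last(3).join == '-->'
  end
  out.join
end
```

Note that '(*v1*) hello' passes through untouched, preserving the space that distinguishes it from '(*v1*)hello'.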

At this point I need to gather more iscript sample files and continue improving the parser.

Sunday, April 29, 2007

TAS Must Die, Chapter 20

I made the changes to the Operator precedence parser as per my previous entry. Works great.

I decided that I probably had a cruft buildup in my Executable subclasses, so I started writing some junit tests. They aren't traditional low-level tests. I just couldn't build meaningful Executables without writing a lot of code or making the Executable trivial. So I ended up writing test classes that focused on testing an Executable class, but if other Executables had to be invoked, so be it.

The junit tests include a piece of iscript that tests some Executable. This is parsed, then compiled into Executables, which are then executed. For conditionals, I might have the iscript emit a 'true' or 'false' piece of text in the appropriate part of the if/else/endif. For expressions, I fetch the result from the global variable hashmap and compare it to an expected value.

So far I turned up a little crud, nothing too bad. I need to see how iscript handles weird cases, such as ("a"*3) or ("bob" && true). Once I know the proper behavior for the corner cases I'll be able to fix the rest of the junit errors.

Oh, and if you aren't using junit, you ought to be. It's a Sure Thing.

Saturday, April 28, 2007

TAS Must Die, Chapter 19

I expanded the operator precedence parser (OPP) to handle explicit types of IDs the right way. Then I corrected all the fudging I did with the datatypes. Suddenly the whole thing really works better. Quite satisfying.

I missed another syntax element, namely that a variable being set can also be indexed:

<!-- #SET NAME = BOB [ expression ] VALUE = expression -->

With a little agony, I was able to use the same OPP for this. The first gotcha was that by the time the recursive descent parser (RDP) has scanned BOB and [, we've read too much to satisfy the syntactic needs of the OPP. Fortunately I had built a pushback into my TokenStream class. So upon scanning "BOB[" in the RDP, I push BOB and [ back onto the stream and then invoke the OPP which scans a reasonable expression and works fine. The second gotcha was that I was using the --> token to know when the expression was over. While this works for

VALUE = expression -->

it doesn't work for

NAME = BOB [ expression ] VALUE...

Lovely, eh? I modified the OPP to accept an 'end token'. Now I pass --> or VALUE to the OPP as appropriate for the situation. This produced the right result. Even so, I don't like it - it doesn't feel like a bulls-eye.

Currently the OPP throws an exception if it scans a token which it can't find in its OP table. I believe I can simplify by using this event to inject a synthetic end-of-expression token and let the RDP handle any syntax error caused by the mystery token.

For example, the following input would cause an exception:

id1 + id2 XYZZY

Assuming XYZZY is not the end-of-expression token. What's key is that a parser notes the syntax error. This is the situation that was happening last night. Unfortunately, it WAS valid with the new use of the OPP. I changed the parser so I could specify XYZZY as the end-of-expression token.

Now, upon reading a token that isn't in the OP tables (such as VALUE or -->), the OPP pushes that token back onto the stream, and uses a totally fabricated 'end of expression' token instead. Now what will happen is that the expression will be reduced as per normal and the mystery token is available to the RDP.
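The pushback-and-inject behavior can be sketched like this. A hedged illustration with invented names (OP_TABLE, next_parser_token, the id regex), not my actual TokenStream code:

```ruby
# Sketch: when the OPP scans a token with no entry in its tables, push
# it back onto the stream and hand the parser a synthetic
# end-of-expression token instead of throwing.
OP_TABLE = %w[+ - * / ( ) [ ]]   # illustrative subset of the OP table

def id?(tok)
  tok =~ /\A[a-z]\w*\z/          # lowercase identifiers count as IDs
end

def next_parser_token(stream)
  tok = stream.shift
  return :end_of_expr if tok.nil?
  return tok if OP_TABLE.include?(tok) || id?(tok) || tok =~ /\A\d+\z/
  stream.unshift(tok)            # leave VALUE or --> for the RDP
  :end_of_expr
end
```

So scanning 'a + b VALUE' yields the expression tokens plus a synthetic end-of-expression, and VALUE is still sitting on the stream for the RDP.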

Here are some possibilities:

<!-- #SET NAME = BOB [ a+b ] VALUE = 10 -->

While parsing BOB[a+b]..., the OPP scans VALUE, inserts the end-of-expression token, produces pcode for BOB[a+b], and pushes VALUE back onto the stream. Now the RDP resumes with VALUE. We're good. Then the OPP sees the 10, gets confused by -->, pushes it back, and injects the end-of-expression. 10 is valid. The --> is left to the RDP, which likes it.

<!-- #SET NAME = BOB [ a+b ] VAXUE = 10 --> (note the typo in VALUE)

After the OPP scans the ], it sees VAXUE, which the lexer has probably misinterpreted as a variable. The OPP checks its tables, finds it (!), and throws an error since a VARIABLE isn't allowed after a ]. Good.

<!-- #SET NAME = BOB [ a+b ] WRONGTOKEN = 10 -->

Someone typed a known token accidentally. The OPP sees it, doesn't have a table entry, injects the end-of-expression token, and pushes back WRONGTOKEN. BOB [ a+b ] is recognized normally since it is valid. WRONGTOKEN is then scanned by the RDP, which throws an error since it expects VALUE. So we're still good.

Stay tuned. This will let me unravel some of the unsatisfying crud I did last night.

Friday, April 27, 2007

TAS Must Die, Chapter 18 (or, why the STAY PUFT Marshmallow Man pwns me)

Every time I look, I find another stinky wart in my code, lol.

Instead of adding any type of factory or if/else/switch facility to allow me to create specific pcode instances, I made the operator precedence table very very explicit. So when I reduce a '*' I know to generate a MultiplyPcode instance. Previously I generated a generic "binary" pcode with "*" as an attribute. But then getting from "*" a representative class required a Factory or switch().

Now, the explicit parser table works great, though adding new Redux/Pcode/Executable classes is tedious. But there's nothing to break in the approach, so once it works, it will work forever. But is it too explicit? Individual rows for every possible operator and ID type are great, totally accurate, and easy to use. But the table is getting unwieldy. The very incomplete table is already 15x15 or so. I can see it floating up to 21x21.

Currently, the table uses the stack's topmost operator on one axis and the input operator on the other. The intersection identifies the action to perform.

This works great until you get 6 operators with the same precedence. Like >, >=, ==, !=, <=, <. Now I'm adding a huge number of virtually identical cells to the table. Perhaps associating the operators with a numeric precedence would have allowed the table to be structured differently and a lot smaller. Instead of:


action = crossIndex(stackOp, inputOp)


where action is shift() or MultiplyReduce(), I could have done:


diffTypeOfAction = crossIndex(stackOp.precedence, inputOp.precedence)


And now if diffTypeOfAction is reduce(), you execute stackOp.reduce(). That extra step seems to allow the table to be about 1/4 the current size without any loss of functionality. Since my operators are all left-associative and of consistent precedence (there is no a>b>c>a type of precedence chain) I think I can create a reasonable set of abstract precedences.
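That precedence-comparison step might look like this. A hedged sketch with an invented PRECEDENCE table, assuming (as stated above) that all my operators are left-associative:

```ruby
# Sketch of the smaller table: assign each operator a numeric
# precedence, then decide shift vs. reduce by comparing precedences
# instead of cross-indexing every operator pair. Left-associative
# operators reduce on a tie. The table values are illustrative.
PRECEDENCE = {
  '*' => 2, '/' => 2,
  '+' => 1, '-' => 1,
  '<' => 0, '<=' => 0, '>' => 0, '>=' => 0, '==' => 0, '!=' => 0
}

def action(stack_op, input_op)
  if PRECEDENCE[stack_op] >= PRECEDENCE[input_op]
    :reduce   # then execute stack_op's own reduce()
  else
    :shift
  end
end
```

The six comparison operators collapse into one precedence level instead of six nearly identical rows and columns.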

Making the table too explicit caused me to make another poor decision. I fudged ID-type tokens. Because I was tired of adding rows and columns to the precedence table one night, I slammed the Token's class to VARIABLE so I didn't have to mess with numeric or string constants. I knew I'd be revisiting it. But like 3 of the 4 Ghostbusters, I didn't know the form of The Destroyer. In my case, it's a bunch of Executables that don't know the type of their operands! D'oh!

Perhaps the Stay Puft marshmallow man would have been better.

Thursday, April 26, 2007

TAS Must Die, Chapter 17

I started working on code generation. For me, this is converting a list of pcodes to a list of executables, executables being classes that implement an interface called Executable. If you can imagine it, it requires implementors to define a method called "execute()".

Preliminary work on the assign, emit, and emitVariable executables went well but did point out flaws in my symbol management, or lack thereof. There are a few places where I 'forget' that some operands might be constants, and these won't be found in the run-time data structure of iscript variables. I could cheese out and check the values for a leading number or quote and figure it out that way. But fixing the problem is the right answer. This will reverberate all the way into the operator precedence parser, I fear, and ultimately cause me to add two more rows and columns to the precedence table.

It's probably just as well; my current mechanism, as I think about it, probably doesn't complain about having a constant as an l-value. I don't know what would happen if I did.

Once I have the symbol issue resolved, I'll be sailing again.

Last night's main trauma was an issue caused by my BinaryOperatorPcode class. This class manages an operator, two operands, and the result of an operation. Think

result <- op1 operator op2

So I have this class which can display the pcode for any binary operator - plus, minus, divide, multiply, and, or, etc. It works great. The problem comes when I want to generate the Executable instance. But which Executable? It's one thing to print a string like "+" or "*". It's quite another to create an instance of a class that does the actual work.

So I have a pcode class with any one of 15 or so binary operators in it. How DO I determine the proper Executable? One way is to use some hideous switch() statement in the pcode... I hate these, and rambling if-else-if constructs, for two reasons. First, such statements imply bad OO design. Sometimes they are necessary but can be hidden in a Factory or Builder of some sort. Second, with a little work, you frequently find that there's no need for the switch() at all. switch() is an easy symptom-fix.

I think that this is the case here. When in the operator parser, I know each operator specifically since I have to shift and reduce based upon each operator's specific precedence. Invoking the "BinaryOperatorRedux" class was convenient but ultimately unhelpful with regards to generating the very specific Executable instances.

I'll revamp my precedence table and remove all of the BinaryOperatorRedux references. I'll replace them with references to operator-specific reduction classes. Since operator manipulation is identical for all binary reductions, specific classes will almost certainly subclass an augmented BinaryOperatorRedux class, supplying a "protected Pcode getPcode()" method. And once I have an operator-specific pcode class I don't have to do any logic to figure out the appropriate Executable. MultiplyPcode will probably produce MultiplyExecutable, after all.

The cost of this is a larger precedence table and many more classes. But the classes are all trivial. Each operator-specific reduction class adds one 'run time' line of code to the system - the code that tells which Pcode class is proper. Each operator-specific pcode class again adds one 'run time' line of code - the code that tells which Executable is appropriate. Each operator-specific Executable class will add one 'run time' line of code to the system - the code that does the actual work.

As I typed this, I became sure that I have a good answer. I'll add a fair number of one-liner classes and reduce run-time complexity by eliminating switch() statements.

Monday, April 23, 2007

TAS Must Die, Chapter 16

Made nice progress lately. The operator precedence parser is integrated now and works fine. Discovered I had forgotten that within the HTML, iscript allows you to embed iscript variables. For example, in:

my name is (*name*)

(*name*) will be replaced with the value of an iscript variable named "name."
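The substitution itself is simple enough to sketch. This is only an illustration of the behavior described above, not my emit code; the variable hash and "Dave" are made up:

```ruby
# Sketch of embedded-variable expansion: replace each (*name*) in the
# HTML with the value of the matching iscript variable (empty string
# if the variable doesn't exist).
def expand(html, vars)
  html.gsub(/\(\*(\w+)\*\)/) { vars.fetch($1, '') }
end
```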

I added the '(*' and '*)' tokens to the lexer file and built a new iscript lexer. Then I changed the iscript parser to recognize this and produce a slightly different 'emit' command. It worked the first time. /flex

Here's a snippet of my pcode, which is pretty darn experimental:



I'll probably start the servlet next. It shouldn't be any harder than anything else. :-/

I'll need a translator to convert the pcode to executable stuff, and probably a really lame caching mechanism. Then I should be able to see pages display.

I'll then be in a loop of:

1. parser doesn't handle X
2. lexer doesn't support the tokens
3. mod lexer
4. mod parser
5. view using servlet
6. goto 1 until exhausted

Friday, April 20, 2007

TAS Must Die, Chapter 15b

I implemented an operator precedence parser in a more object-oriented way. I am not sure if it's actually better than the common C implementation.

It currently doesn't have the mojo for IScript's syntax. But I could get there simply by defining precedence rules. It would probably take me about 20 minutes.

I ended up with 5 new notable classes and interfaces:

1) A ParserAction interface, instances of which perform shift, accept, reduce, and error processing. A traditional OPP includes a table of actions which are scalars. The scalars are used in some manner of switch or if/else construct to find the code snippet to execute. A more hardcore C implementation would use a pointer-to-function. In my OO implementation, the table includes the action class to execute. This is analogous to C's pointer-to-function construct. I currently have 4 implementations of this interface. When the final parser is all done, it would be easy to have 15.

2) A map of input-Token-to-action. The key would be the input Token. The value would be an instance of ParserAction.

3) A Map of Token-to-#2. The key is a Token (taken from the top of the operator stack.) The value is the map described in #2 above. When the ParserAction says to shift, I use the input Token to find the appropriate #2 from this map, and I push it onto the operator stack.

4) A Stack of 'operators'. This is the biggest departure from the typical 'student' OPP implementation. In the trivial case, you push operator characters onto the stack. When it comes time to pop the stack and do something, you do a lookup to determine the operator's index into the precedence array. Instead of pushing the operator character (like a "+") or a more OO Token, I push the entire map from #2 above. Now when it is time to decide how to treat an input Token that's an operator, I peek() the map on the stack, do a get() from it using the input Token as the key, and invoke the resulting ParserAction. There are no extraneous lookups in this implementation.

5) A Stack of IDs. When we scan an ID, push it onto this stack. When we reduce, we'll pull from this stack and sometimes push back a temp variable.
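Items 2-4 above can be sketched like this. A hedged Ruby illustration of the structure (the real code is Java, and ROWS, next_action, and the '$' end marker are invented names):

```ruby
# Sketch of the map-of-maps dispatch: each operator's table row
# (input token => action) is what gets pushed onto the operator stack,
# so choosing the next parser action is just a peek and a get, with no
# extra precedence lookup. '$' stands in for an end-of-input marker.
ROWS = {
  '+' => { '*' => :shift,  '+' => :reduce, '$' => :reduce },
  '*' => { '+' => :reduce, '*' => :reduce, '$' => :reduce }
}

def next_action(op_stack, input_token)
  op_stack.last[input_token]   # peek the top row, index by input token
end

op_stack = [ROWS['+']]         # shifting '+' pushed its whole row
```

In the real parser the values are ParserAction instances rather than symbols, which is what replaces C's pointer-to-function.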

The code works well.

I didn't get the elegant lean code I was looking for. Perhaps I expect too much. I do have a very nice parser that doesn't feature some god-awful if/else construct or switch statement that's as long as your arm. The actual core of the implementation is perhaps 8 lines of code. The rest includes a lot of setters and getters and other 'non-breaking' code.

Wednesday, April 18, 2007

TAS Must Die, Chapter 15

While spelunking through Iscript syntax, I noticed that the expression on an IF statement is fairly robust, with logicals and parentheses. Once you can have nested parentheses, you have to support a stack of some sort or you just can't parse it properly. I did the expression parsing in P1 using a recursive descent parser. I wasn't in the mood to remove the left recursion from Iscript's more complex conditional statements, so I decided to implement an operator precedence parser.

The typical OPP, as per the link, includes a table of precedences. This tells you if 1+3*5 should be treated as (1+3)*5 or 1+(3*5).

I've done a few prototypes now to refresh my memory. It occurred to me that I was implementing an algorithm in Java using the exact same architecture I used many years ago in C. And it was clunky. So I scrapped it - I decided that the table of precedences just doesn't work as well in Java as it does in C.

I'll likely replace the traditional grid with a class that is initialized with Token-on-stack and Token-on-input pairs. Each pair will resolve to a ParserAction class which guides the parser. Java doesn't have a very nice initialization mechanism so it will be annoying. But the part that does the actual parsing will be lean.

Monday, April 16, 2007

TAS Must Die, Chapter 14

I started creating the P2 parser this weekend. I took a delightfully Neanderthal approach, which was to write just enough code to dump the iscript tokens to sysout, find a cohesive batch of them (like the SET statement, for example), and code it up.

It didn't take long to get REM, SET, and INCLUDE mostly working. I suspect as I feed different iscript files through the parser I'll find different cases that the parser doesn't handle. I don't actually have any documentation for iscript. I have no clue as to the full extent of the supported syntax. That's made maintaining it for these last 2 years pretty invigorating at times.

What's interesting about this phase of the project is designing the pcode. I think of pcode as über assembly language. Each pcode statement will do a specific task and contain all the information to do so. For example, if I know a token is a number, I'll probably mark it as such in the pcode.

|set|someVariableName|314|

The pcode interpreter could figure out if 314 is an integer or a variable. But I've already done that in the lexer and parser. So why not have

|set|someVariableName|314,constant,integer|

I don't know if the pcode interpreter will need all the information, but if I have it I might as well provide it.
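Carrying the tags forward might look like this. A hedged sketch (Operand, classify, and set_pcode are invented names; the real project is Java):

```ruby
# Sketch of tagged pcode emission: since the lexer already classified
# each operand, carry that classification into the pcode rather than
# making the interpreter re-derive it.
Operand = Struct.new(:text, :kind, :type)

def classify(text)
  if text =~ /\A\d+\z/
    Operand.new(text, :constant, :integer)
  elsif text =~ /\A".*"\z/
    Operand.new(text, :constant, :string)
  else
    Operand.new(text, :variable, nil)
  end
end

def set_pcode(name, value_text)
  op = classify(value_text)
  "|set|#{name}|#{op.text},#{op.kind},#{op.type}|"
end
```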

I'll probably start writing the Factory in parallel. The factory will accept a list of pcodes and return a list of executable classes. This will allow me to execute the pcode - and check my work.

Sunday, April 15, 2007

Adventures in Anodizing Aluminum

My wife told me to get a hobby or die. So my friend and I are building a tube amp - a copy of the venerable Fender 5F1 circuit. First, it is amazing how involved it's been to convert that schematic to a real thing. There are the parts called out for by the schematic, but then the nuts, bolts, bushings, washers, wire, mounting hardware, you name it. It's been eye opening and educational which is the whole point of the project. I could buy a vintage amp for less than I spent making this one, lol.

Now, we're making everything that a couple of hacks can make. The control panel plate is the thing the guitar plug and volume knob are mounted to. We made ours from aluminum. It's what we had. But aluminum is soft and scratches easily. I could spray it with a Krylon enamel rattle-can, but that's boring and I have spent a LOT of time on this amp. I don't want to cheese out now. We decided to anodize it.

Aluminum rusts, just as steel does. That's why el-cheapo lawn furniture can turn your hands black. Anodization is a process where the aluminum gets a different type of rust than normal. The anodized surface is a thin coating that is almost as hard as diamond. Remarkably, the basic process is simple and as safe as anything that involves sulphuric acid and a battery charger.

It's a three-step process.

The first step is where we run current from the battery charger through the acid bath: a lead cathode hangs from the charger's negative side, and the aluminum parts to be anodized hang from the positive side, making them the anode. This causes a layer of aluminum oxide to form on the aluminum parts.

The second step is where you drop the now-anodized aluminum part into a bath of common clothing dye. The aluminum oxide surface has microscopic pores that are open. The dye can sneak right into those pores.

The third step is where you boil the parts in water. The heat makes the pores close, trapping the dye and hardening the surface, which, magically, becomes an 'aluminum oxide monohydrate' that I do not pretend to understand.

All this, and you can't really over-do it. As you apply the current to the acid bath, the oxide layer starts forming on the parts. This layer does not conduct electricity well! As the layer gets thicker and thicker, it conducts electricity less and less. So the process is self-limiting. After a while, electricity can't flow, and it all just... stops. Crazy. The Lord provides.

Armed with our knowledge, we tried an experiment yesterday. We produced lovely un-anodized aluminum and burned our anode hangers in half. Funny, but really, I didn't expect total failure. I thought we'd get a mediocre result and have to make an adjustment. We did deviate from The Plan a little, so next time we'll follow the instructions better.

Here is an example of some parts anodized by a professional. Look around, you're probably surrounded by anodized aluminum. Stay tuned.

Saturday, April 14, 2007

TAS Must Die, Chapter 13

Here's where I'm heading. The square boxes below are programs. The ovals are lexers or other output files. The round-corner boxes are input files that will be consumed by lexers.

L1Gen is the hand-coded lexer generator. It produces L1, a lexer for lexer descriptions. L1 produces the tokens that allow P1 to read a lexer description and produce a lexer. L2 is a lexer that tokenizes iscript files. P2 is a parser that will use L2, ingest an iscript source file, and produce pcode that is the logical equivalent of the iscript source file. The TASMustDie servlet will ingest the pcode and produce HTML which is lobbed at the browser. I expect the servlet will be non-trivial and implement the Command and Factory design patterns. Basically, the servlet sucks in the pcode, rips through it by line, and invokes the Factory to produce Command instances. Now we have an executable version of the textual pcode.
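The servlet's Command/Factory idea can be sketched in a few lines. This is only a guess at a shape it might take: the opcode name, class names, and pcode format below are all invented for illustration, since the real pcode hasn't been designed yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one Command per line of pcode, built by a Factory.
interface PcodeCommand {
    void execute(StringBuilder html);   // append this command's HTML output
}

class PrintCommand implements PcodeCommand {
    private final String text;
    PrintCommand(String text) { this.text = text; }
    public void execute(StringBuilder html) { html.append(text); }
}

class CommandFactory {
    // Rip one line of pcode apart and produce the matching Command.
    static PcodeCommand fromLine(String line) {
        String[] parts = line.split(" ", 2);
        if (parts[0].equals("PRINT")) return new PrintCommand(parts[1]);
        throw new IllegalArgumentException("unknown opcode: " + parts[0]);
    }
}

class PcodeDemo {
    public static void main(String[] args) {
        List<PcodeCommand> program = new ArrayList<>();
        for (String line : new String[] { "PRINT <b>", "PRINT hello", "PRINT </b>" })
            program.add(CommandFactory.fromLine(line));

        StringBuilder html = new StringBuilder();
        for (PcodeCommand c : program) c.execute(html);
        System.out.println(html);   // <b>hello</b>
    }
}
```

The nice part of this shape is that the servlet's main loop never needs to know which opcodes exist; only the Factory does.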

Everything through and including L2 is working, as far as I can tell. So now I should start writing P2 and designing the pcode. Iscript isn't very tricky until we get to the point where we have to communicate with the Tandem; then the syntax gets a little different. So I probably have more lexer work in my future. For all that, I don't foresee any difficulties. I mean, what could possibly go wrong??!

I'm taking a breather now. So far, the lexer generator is working. I'll probably create a pretty brutal test of it soon. This will involve creating an iscript lexer (L2 in the diagram) using L1Gen's lexer, and creating an alternative using the third iteration of a bootstrapped lexer. I can probably run 10,000 tokens through each one, easy. If the output streams aren't the same, there's a problem.

Friday, April 13, 2007

TAS Must Die, Chapter 12.

Just a quick note. Today I was able to bootstrap the lexer-generating parser. That is, the parser can now generate a lexer which it can use to generate an identical lexer. During one pass of the optimization phase, the lexer balloons to about 5,500 states, then finally optimizes down to about 50. I have no idea why it's producing all those temporary states. The hand-coded lexer generator uses the same backend calls and doesn't seem to have any difficulties.

I suspect I'll be revisiting this at some inconvenient time.

Thursday, April 12, 2007

TAS Must Die, Chapter 11

More on design patterns. I have this graph of states, and I need to navigate it for multiple reasons. Sometimes I want to count the nodes. Sometimes I want to group them to remove redundancies. Sometimes I just want to generate a printable version. Or generate the lexer.

Say you want to print the state diagram. Normally what happens is that you write the navigation routine and right there in the middle of it, you put your "print" function. You get that working. Then you decide to count the items in the data structure. The navigation code you already have has the print stuff buried in the heart of it. That's no good for counting. So you copy/paste the navigation code and replace the "print" stuff with "count" stuff. About the 3rd time you do this, you realize it sucks and you want a different solution. In C, you'd pass a pointer-to-function as an argument. You'd navigate and then call the function to do the task-specific work. Pretty good stuff, but maybe a little fast-and-loose. As normal with C, you're at the mercy of the programmer's skill, mentality, and deadline.

Java has no pointer-to-function language construct. But using the Visitor design pattern we can cook up something that acts the same and has a little type checking in it. The Wikipedia article is very good, but I'll paraphrase.

1. Create an interface called Visitor that has a visit(SomeObject) method, where SomeObject is germane to your problem domain. For this project, I had a graph of states, so my interface was


public interface Visitor {
    void visit(State s);
}


2. To the class that manages the navigation data structures, add the following:


void accept(Visitor v) {...}


This is the class that manages navigation. For a binary tree of type Node, you might have


public class Node {
    // stuff
    public void accept(Visitor v) {
        if (leftChild != null) leftChild.accept(v);
        v.visit(this);
        if (rightChild != null) rightChild.accept(v);
    }
}


Rock solid. Now, accept() is totally divorced from what the Visitor is actually doing. Visitor is an interface so by definition there's no implementation and no way for accept() to make any assumptions. A particular Visitor could generate a printable Node, count the nodes, search for a match, anything.

In my case, since my graphs (and states) are managed by the Nfa class, I implemented the method there.

3. Create an implementation of Visitor. Add the function-specific code to the visit method.


public class CountVisitor implements Visitor {
    private int count = 0;
    public void visit(State s) {
        count++;
    }
    public int getCount() {
        return count;
    }
}


4. Add any auxiliary methods to the Visitor implementation as needed, such as getCount() above.

Here's an example from my project.



When I generate my lexer, it's important that my states be numbered starting at zero. For efficiency, the state id will be used as an index into an array! Given that I am creating and destroying hundreds of states as I convert the NFA to a DFA, the state IDs are all over the place. As I navigate my graph of states, the visit(State) method is called once for each state. The visit method simply adds a reference to the state to an ArrayList. Now I know all the states.
When I am done, I call the renumber() method, which just iterates over the states, reassigning their IDs. To use the visitor, the following does the trick:

StateRenumbererVisitor v = new StateRenumbererVisitor();
nfa.accept(v);
v.renumber();

It can be hard to get your mind around this. Once you do it, however, you'll be hooked.
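For the curious, here's roughly what that renumbering visitor might look like. This is a guessed sketch, not the project's actual code; the tiny State class stands in for the real one.

```java
import java.util.ArrayList;
import java.util.List;

// A guess at the renumbering visitor described above. 'State' here is a
// minimal stand-in for the real project class.
class State {
    int id;
    State(int id) { this.id = id; }
}

interface Visitor {
    void visit(State s);
}

class StateRenumbererVisitor implements Visitor {
    private final List<State> seen = new ArrayList<>();

    // Called once per state during the graph walk; just remember it.
    public void visit(State s) { seen.add(s); }

    // After the walk, hand out dense IDs starting at zero.
    public void renumber() {
        int next = 0;
        for (State s : seen) s.id = next++;
    }
}
```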

University of Kentucky

The family visited the University of KY today. My son is in the process of selecting a school, so we got an appointment for a group-tour/dog and pony show. I went to UK in the early 80's, so I figured I'd go down there and show him around.

Wrong.

The place is totally different now. I suppose I shouldn't be surprised after 25 years, but holy cow. So many more buildings, so much more everything. Plenty of green space still. The huge concrete area outside the Patterson tower is gone, replaced by landscaped green space. There's a new student center, even newer than the one they opened in 84. The old grilles are gone, replaced by chain restaurants. The dining area in the old student center is huge. The new library is magnificent: multiple floors, superbly tasteful architecture, open and airy. And the new engineering building is impressive in its own way - it looks like a modern take on an old factory, but with lots of glass. And there's a "gym" where no classes are held - it's strictly for students' health. It sports a 30' rock wall, a large indoor running track, all manner of weights, and a single room with four side-by-side full-size basketball courts. It's huge. There's also an outdoor track and an aquatic center. And the programs my son is interested in (some manner of business) are very highly rated.

Eye opening trip to my old stomping grounds. Man, I'm obsolete.

Wednesday, April 11, 2007

TAS Must Die, Chapter 10b

I figured I'd goof off with the Decorator pattern. It didn't take long to find a more satisfying solution to the problems outlined previously.

The gist of the Decorator pattern is that you have potentially several classes, each implementing some interface X. They each accept an X as an input source. So you can daisy-chain them together as needed.

IntX t1 = new GetWords(); // GetWords implements IntX
IntX t2 = new Trim(t1);   // Trim implements IntX
IntX t3 = new Upper(t2);  // Upper implements IntX
loop(t3);

private void loop(IntX q) {
    while (true) {
        System.out.println(q.read());
    }
}



So as we buzz in the while loop, we read from t3, t3 reads from t2, and t2 from t1. I don't know where t1 gets its input :-) Assume t1 returns " now". t2 trims it to "now". t3 converts it to "NOW". So the loop prints trimmed uppercase words. If you decided you needed to enhance this to produce German, you could create a new IntX decorator that sipped from t3 and performed the translation. A nice thing about this is that you can reorder the IntX's (sometimes it matters), add new ones, or remove some, without any negative impact on your code. Note that the loop accepts the IntX interface - it has absolutely no clue what's going on behind the scenes in "q". That's a good thing.
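Here's a compilable version of that chain, under one big assumption: that IntX is a simple read-a-string interface with null meaning end of input. The real project's interface surely differs; this just shows the wiring.

```java
// Minimal stand-in for the interface the decorators share.
interface IntX {
    String read();   // null means end of input
}

// The source of the chain: hands out canned words with stray whitespace.
class GetWords implements IntX {
    private final String[] words = { "  now", " is ", "the", " time" };
    private int i = 0;
    public String read() { return i < words.length ? words[i++] : null; }
}

// Decorator: trims whatever the wrapped IntX produces.
class Trim implements IntX {
    private final IntX src;
    Trim(IntX src) { this.src = src; }
    public String read() {
        String s = src.read();
        return s == null ? null : s.trim();
    }
}

// Decorator: uppercases whatever the wrapped IntX produces.
class Upper implements IntX {
    private final IntX src;
    Upper(IntX src) { this.src = src; }
    public String read() {
        String s = src.read();
        return s == null ? null : s.toUpperCase();
    }
}

class DecoratorDemo {
    public static void main(String[] args) {
        IntX q = new Upper(new Trim(new GetWords()));
        for (String s = q.read(); s != null; s = q.read())
            System.out.println(s);   // NOW / IS / THE / TIME
    }
}
```

Note that swapping the order of Trim and Upper here wouldn't change the output, but for other decorators it might; the chain makes the order explicit.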

So, thumbs up for the decorator pattern.

Later I'll talk about another nice pattern, the Visitor. This decouples the navigation in the NFA from some action to do on those states.

TAS Must Die, Chapter 10

I started cleaning up some cruft I left behind in the code. Details, nothing worth mentioning.

Now I use the lexers I generate in two different places. One use is by P1, the parser that produces lexers as its output. P1 interprets common regular expressions and thus uses a lot of operators:

variable: [a-z]([a-z]|[0-9])*

for example. [, -, ], |, (, ), and * are all operators of some sort. Now sometimes you'll want to interpret one of these characters as a common boring character, not as an operator. In this case, I escape it with a backslash:

copyright: \(c\)

for example.

The other use of the lexer is just to read a bunch of iscript. I don't need a lot of escape-type processing. But it would be nice if the lexer ignored whitespace, devoured extraneous new lines, and handled windows newlines.

I've tried a few times to create a class that did both tasks well. It was a failure. They are pretty different. So I tried creating a base class that did very little except implement a pushback stack so I could unread tokens. It's helpful sometimes. I then subclassed two different TokenReaders, one for the parser and its crazy operators, and one for more mundane iscript processing.
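The pushback base class might look something like this. This is a guessed sketch, since the post doesn't show the real TokenReader; only the idea (subclasses supply the scanning, callers can unread) comes from the text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a pushback base class. Subclasses supply nextToken();
// callers can unread() tokens they weren't ready for.
abstract class TokenReader {
    private final Deque<String> pushback = new ArrayDeque<>();

    // Subclass-specific scanning lives here.
    protected abstract String nextToken();

    // Pushed-back tokens are re-delivered before any new scanning happens.
    public final String read() {
        return pushback.isEmpty() ? nextToken() : pushback.pop();
    }

    public final void unread(String token) {
        pushback.push(token);
    }
}
```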

My result, while it works, is not satisfying. It's clunky and brittle.

So combining it all was terrible, subclassing was unsatisfying. Within the next few nights I'll implement a Decorator design pattern. The gist of this is that you have two classes that implement some interface. You wrap one object in with another. You query the latter in your application. It invokes the wrapped object as necessary, and applies some type of transformation or filter.

Monday, April 9, 2007

TAS Must Die, Chapter 9

All the bugs that I know of are gone now. I haven't taken the 'bootstrap' plunge yet. That being said, it's been a few days since I had to rebuild a base lexer using the L1Gen lexer program.

I'm lexing a large part of the TAS Iscript input files now. Now I'm at the crossroads of

1. Writing a parser for the tokens so I can convert iscript into something else. That's ultimately what this project is about. I previously hand-coded "P1", a recursive-descent parser that produces DFAs from the lexical rules. It was surprisingly easy. I'll probably do the same for the iscript parser. Easy Peasy.

2. Writing a parser generator. Then I could describe iscript's structure using BNF or something and have the parser be generated. This is another favorite topic for me. Recent experiments converting BNF into Greibach Normal Form have been very successful.

3. Finding a parser generator for Java, similar to yacc. Where's the fun in that? If I was doing this project as part of my job, I'd be all over it. But I'm not so I'm not.

4. Cleaning up what I have, because it will probably bite me anyway. For example, I still don't handle newlines and other control characters well. The subtlety of this surprises me. If I started all over, I'd probably start with control (and escaped) characters first.

5. Stopping. I've accomplished my first goal and learned a lot.

I'll probably bounce between #1 and #4, addressing weaknesses of the lexer generator as needed. And after a few visual observations, it's no good to have some manner of p-code being produced without doing anything with it, so I'll probably start building a servlet to interpret said p-code.

This has been quite a learning experience, I need time to breathe now.

Friday, April 6, 2007

TAS Must Die, Chapter 8

My 'go for the throat' coding approach, where I implement the big ideas at the expense of the details, is starting to catch up with me. For example, I currently don't have a good way to specify control characters (such as newline) in my rules. Using the common "\n" yields an NFA that recognizes an 'n'. The backslash works great for converting lexer tokens

"(" => OPEN_PAREN, "("
"\(" => CHAR", "("

but doesn't work so well for n and r.

Other issues involve the lexer and the file it processes. It doesn't handle EOF very gracefully when the last line of the input file isn't terminated in a newline. I suppose it does exactly what it should, from a logical point of view, but that diverges from what I want it to do, which is to insert a NEWLINE at the end of that line, and then produce the EOF. So I have to wonder if I should rig the lexer to produce BOF (BOL stuff EOL)* EOF automatically since that's what I would really like to see. So I'm thinking I could feed the lexer a raw input stream and a cooked input stream, with the latter inserting the file markers when they are absent. BOL and EOL as explicit tokens give me a nice way of matching strictly-formatted files. The TAS Iscript stuff I intend to eventually parse includes commands that always start at the beginning of the line, at least at our installation. So BOL would give me a nice way to match this.
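The 'cooked' stream idea could be as small as a wrapper that guarantees the last line is newline-terminated, so the lexer always sees a NEWLINE before EOF. A minimal sketch, with invented names; the real version would also handle BOF/BOL/EOL markers and Windows newlines:

```java
import java.io.IOException;
import java.io.Reader;

// Wraps the raw stream; if it ends without a newline, injects one
// before reporting EOF. Names here are made up for illustration.
class CookedReader {
    private final Reader raw;
    private int last = '\n';   // pretend the stream starts after a newline

    CookedReader(Reader raw) { this.raw = raw; }

    // Returns the next character, or -1 at EOF. A synthetic '\n' is
    // delivered first if the raw stream's final line lacked one.
    int read() throws IOException {
        int c = raw.read();
        if (c == -1 && last != '\n') return last = '\n';
        if (c != -1) last = c;
        return c;
    }
}
```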

I am gaining an appreciation for the realities of file I/O, control characters, when to ignore whitespace, etc.

It's getting to the point that I am considering adding another lexer that will be tasked with removing whitespace and inserting file markers. It would be enabled by configuration. When
enabled it would act as a filter on the real token stream.

I'm also flirting with the idea of adding the "not" operator which inverts FAs. In my tiny world, A -> not x -> B produces a FA where A has 255 transitions to B - every possibility except 'x'. Implementing this in the general case (where 'x' in the example could be a FA), however, is not straightforward and maybe not useful. More to come on this.

Finally, I have implemented the multiple-lexer idea. I am not sure it is such a great idea though I don't have a better alternative at this point. I did implement a switch in the lexer generator code to treat characters in a case-insensitive way. This means that instead of adding a 'x' to a FA, I add ('x'|'X'). Remember, if brute force doesn't work, you aren't using enough of it!
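That case-insensitivity switch is easy to picture: wherever the generator would add one transition on 'x', it adds transitions on both 'x' and 'X'. The little builder below is invented for illustration; the real backend differs.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the case-insensitivity switch. 'NfaBuilder' and its
// methods are hypothetical names, not the project's actual API.
class NfaBuilder {
    private final boolean caseInsensitive;
    // edges.get(from).get(ch) == destination state id
    private final Map<Integer, Map<Character, Integer>> edges = new HashMap<>();

    NfaBuilder(boolean caseInsensitive) { this.caseInsensitive = caseInsensitive; }

    // Add the transition; when case-insensitive, add both cases of a letter.
    void addEdge(int from, char ch, int to) {
        put(from, ch, to);
        if (caseInsensitive && Character.isLetter(ch)) {
            put(from, Character.toLowerCase(ch), to);
            put(from, Character.toUpperCase(ch), to);
        }
    }

    private void put(int from, char ch, int to) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(ch, to);
    }

    Integer dest(int from, char ch) {
        Map<Character, Integer> m = edges.get(from);
        return m == null ? null : m.get(ch);
    }
}
```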

Thursday, April 5, 2007

TAS Must Die, Chapter 7b

Reviewing the last entry, the behavior of the two parallel structures on the right could be obtuse. Basically, each state owns a section of these arrays as delimited by begEdge and endEdge. Each row within that range contains a character and the state to go to if we see that character.

That's nice and compact. However, it does constitute a linear scan, which could be relatively slow. It's probably immeasurable in any real case. Be that as it may, if the lexer needed to be faster we could replace the parallel edge and dest arrays, as well as the begEdge and endEdge arrays, with a gawd-awful sparse matrix. It would take up more space and probably not be very i18n-able. However, we'd be able to generate tokens about as fast as it can be done.

The table would be an n×256 int array, where n is the number of states in the lexer. The 256 ints for a state represent the possible transitions. The ASCII value of the input character is an index into that row; the result is the next state.

nextState = 0; // start state
while (nextState != -1)
{
    input = getInput();
    state = nextState;
    nextState = sparse[state][input];
}

// 'state' is where we finished, and 'input' would
// be the unaccepted character that we'd re-ingest
// next time.



Not a lot of code there...

Another option is to go the other way, and say all this array stuff is rubbish. The lexer would include normal classes which we'd instantiate the first time the lexer was used. We could replace the four parallel arrays with a simple class, and the two parallel edge arrays with a Map. Instead of having state IDs or token IDs, etc., we'd use references to instances of our classes.
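That object-based alternative might look like this. A sketch with invented names: each state is a real instance, and the two parallel edge arrays collapse into a Map per state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical object-based state: edges as a Map, stop states marked
// by a non-null token name instead of an index into a TokenString array.
class LexState {
    final Map<Character, LexState> edges = new HashMap<>();
    String token;   // non-null marks a stop state and names its token

    LexState on(char ch, LexState dest) {
        edges.put(ch, dest);
        return this;
    }
}

class ObjectLexerDemo {
    // Walk the graph as far as the input allows; return the last token seen.
    static String match(LexState start, String input) {
        LexState s = start;
        String token = null;
        for (char ch : input.toCharArray()) {
            s = s.edges.get(ch);
            if (s == null) break;
            if (s.token != null) token = s.token;
        }
        return token;
    }
}
```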

Wednesday, April 4, 2007

TAS Must Die, Chapter 7

While I'm procrastinating on making the final changes to the lexer, I thought I'd post the lexer's architecture. I wrote it in a frenzy the other day and I almost forgot how it works.


To the left is a variable named 'currentLexer'. This points to the start state of the currently active lexer. When we call the lexer for a token, 'currentLexer' tells us the state to begin with.

This refers directly to the four parallel arrays in the center of the diagram. Any row across them represents a single state in the lexer. 'LexerJump' will, upon successfully lexing a token, change the current lexer. This facility allows lexer designers to create additional lexers to help resolve lexical conflicts. If, while scanning the input, we can't make any progress, TokenString will tell us if we're in a stop state and which token should be returned. BegEdge and EndEdge identify the rows in the 'edge' tables that are associated with the current state. If the current state transitioned on a, b, and c, there would be 3 rows in the 'edge' tables; BegEdge would point to the first row and EndEdge would point to the last.

To the right are two more parallel arrays. A single row represents an input character and the state we advance to if we see that character. EdgeInputs contains a character. The paired EdgeDests row tells us the state we advance to upon scanning that character.

Finally, the Tokens array at the bottom holds all the 'class'es that can be returned by the lexer. It is used by the TokenString array.
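The next-state lookup over those parallel arrays amounts to a short linear scan. The array names below follow the post; the contents are a made-up two-state machine (state 0 goes to 1 on 'a' and back to 0 on 'b'; state 1 owns no edge rows), just to show the mechanics.

```java
// Sketch of the linear scan over the parallel edge arrays.
class ParallelArrayLexer {
    // One row per state: the first and last (inclusive) rows it owns
    // in the edge tables. begEdge > endEdge means the state has no edges.
    int[] begEdge = { 0, 2 };
    int[] endEdge = { 1, 1 };
    // One row per edge: the input character and the destination state.
    char[] edgeInputs = { 'a', 'b' };
    int[]  edgeDests  = { 1, 0 };

    // Return the next state, or -1 if 'ch' matches none of the
    // current state's edges.
    int next(int state, char ch) {
        for (int i = begEdge[state]; i <= endEdge[state]; i++)
            if (edgeInputs[i] == ch) return edgeDests[i];
        return -1;
    }
}
```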

I've referred a few times to 'multiple lexers' and such. Excepting 'currentLexer' and 'LexerJump', the architecture really doesn't know about lexers at all. There are states, their inputs, and the tokens generated. So stacking several DFAs/lexers into these arrays works fine. The only thing we have to do is add a mechanism that lets us point to the first state of any DFA/lexer, and that's what 'currentLexer' and 'LexerJump' do. Previously, instead of a 'currentLexer' variable, I just started at state 0 and there was no concept of more than one lexer.