papers
+HCU papers
courtesy of Fravia's page of reverse engineering

   SLH

   FUTURE VISION
   
   The suppression and resurrection of assembler programming.
   
   Historical perspective
   ~~~~~~~~~~~~~~~~~~~~~~
   A long time ago in a world far away, the designers of the millennium bug
   scribbled out flow charts in ancient ciphers like Cobol and Fortran and sent
   them to the girls in the punch card room, who dutifully typed out the punch
   cards and sent the result to an air conditioned boiler room downstairs where
   they were loaded into a card reading machine and run overnight.
   
   This was a cosy corporate world of high profits, high salaries and endless
   accolades for being on the leading edge of the computer age. The writing on
   the wall started to be read when the oil price rises in the seventies took
   their toll on the profitability of the corporate sector and the cost of running
   and upgrading mainframes generated the need for a cheaper alternative.
   
   The birth of the PC concept was an act of economic necessity and it heralded
   a major turn in the computer world where ordinary people were within
   reasonable cost range of owning a functional computer. When Big Blue
   introduced the original IBM PC, it put a blazingly fast 8088 processor
   screaming along at 4.77 megahertz into the hands of ordinary people.
   
   The great success of the early PCs came from the empowerment, on a
   technical level, of ordinary people who could now do complex things in
   a way that they could understand.
   
   Early Programming
   ~~~~~~~~~~~~~~~~~
   The early tools available for development on PCs were very primitive by
   modern standards yet they were a great improvement over what was available
   earlier. For anyone who has ever seen the electronic hardware guys writing
   instructions for EPROMs in hex, the introduction of a development tool like
   DEBUG was a high tech innovation.
   
   PCs came with an ancient dialect of ROM Basic where, if you switched the
   computer on without a floppy disk in the boot drive, it would start up in
   Basic. This allowed ordinary users to dabble with simple programs that
   would do useful things without the need for a room full of punch card
   typists and an air conditioned boiler room downstairs with an array of
   operators feeding in the necessary bits to keep a mainframe going.
   
   The early forms of assembler were reasonably hard going, and software
   tended to take months of hard work to produce with rather primitive tools.
   This gave birth to the need for a powerful low level language that could
   be used on a PC to improve the output.
   
   C filled this gap as it had the power to write at operating system level
   and as the language improved, it had the capacity to write assembler
   directly inline with the C code.
   
   If the runtime library functions could not do what you wanted, you simply
   added an asm block (the exact keyword varied with the compiler: asm, _asm
   or __asm),

        asm
           {
             mov  ah, 02h    ; DOS function: display a character
             mov  dl, '*'    ; the character to send
             int  21h        ; call DOS
           }

   and compiled it directly into your program.
   
   As the tools improved, driven by market demand, the idea of a common
   object file format emerged, which dramatically increased the power that
   programmers had available.
   
   Different languages had different strengths which could be exploited to
   deliver ever more powerful and useful software.
   
   C had the architecture to write anything up to an operating system.
   Pascal had developed into a language with a very rich function set that
   was used by many games developers.
   
   Basic had emerged from its spaghetti code origins into a compiler that
   had advanced capacity in the area of dynamic memory allocation and string
   handling.
   
   The great unifying factor in mixed language programming was the ability to
   fix or extend the capabilities of each language by writing modules in
   assembler.
   
   Modern Programming
   ~~~~~~~~~~~~~~~~~~
   By the early nineties, modern assemblers came with integrated development
   environments, multi language support in calling conventions and powerful
   and extendable macro capacities which allowed high level simulations of
   functions without the overhead associated with high level languages.
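
   As a rough sketch of what that macro capacity looked like, a MASM style
   macro of the day could wrap a DOS call so that it read like a high level
   statement in the source. The PutChar name and the choice of DOS function
   02h below are only illustrative, not taken from any particular package.

   ;--------------------------------------------------------------

   PutChar MACRO char
           mov ah, 02h             ; the DOS function: display character
           mov dl, char            ; the character to display
           int 21h                 ; call DOS
           ENDM

   ; in source code it then reads like a high level statement,
   ;
   ;       PutChar 'A'
   ;       PutChar 13
   ;       PutChar 10

   ;--------------------------------------------------------------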
   
   To put some grunt into a deliberately crippled language like Quick Basic,
   you wrote a simple assembler module like the following,
   
   ;--------------------------------------------------------------
   
               .Model  Medium, Basic
               .Code
   
       fWrite  Proc handle:WORD, Address:WORD, Length:WORD
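           ; handle, Address and Length arrive as values (declared BYVAL
           ; on the Basic side).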
   
               mov ah, 40h
               mov bx, handle
               mov cx, Length
               mov dx, Address
               int 21h
   
               ret         ; Return to Basic
   
       fWrite  Endp
   
       End
   
   ;--------------------------------------------------------------
   
   Change the memory model to [ .Model  Small, C ] and you had a printf
   replacement with one tenth the overhead.
   
   Code as simple as this allowed you to write to files, the screen or a
   printer, just by passing the correct handle to the function. DOS predefines
   handle 1 for the screen and handle 4 for the printer, while a file uses the
   handle returned when it is opened.
   
   Simply by specifying the calling convention, the programmer could extend
   C, Pascal, Basic, Fortran or any other language they wrote in so that it
   would deliver the capacity that was wanted.
   
   This capacity was nearly the heyday of flexible and powerful software
   development in the hands of non-corporate developers. The monster looming
   on the horizon came to fruition as a consequence of corporate greed on one
   hand and expediency on the other.
   
   The Decline
   ~~~~~~~~~~~
   Legal wrangling about the ownership of UNIX in the early nineties crippled
   its development for long enough to leave the door open for the early version
   of Windows to gain enough popularity to be further developed. With the
   advent of the upgraded version 3.1, DOS users had a protected mode, graphics
   mode add on that offered extended functionality over the old DOS 640k limit.
   
   The great divide started by stealth: development tools for version 3.1 were
   thin on the ground for a long time, and the technical data necessary to
   write protected mode software was proprietary and very expensive.
   
   Even after parting with a reasonably large amount of hard currency, the
   version of C and the SDK that was supposed to be the be all and end all
   came with a development environment that crashed and crashed and crashed.
   The documentation could only be classed as poor and it dawned on most who
   bothered that the proprietor couldn't care less.
   
   The sales were there, and they no longer needed the little guys who had
   supported them on the way up.
   
   The Fall
   ~~~~~~~~
   Over the duration of 16 bit Windows, the little guys made a reasonable
   comeback and produced some very good and reliable software, but the
   die had been cast. The reins of proprietary control drew tighter and
   tighter while the support for the expensive software became poorer and
   poorer.
   
   The problem for the corporate giants was that the world was finite and
   market saturation was looming over their heads in the very near future.
   Their solution was to gobble up the smaller operators to increase their
   market share and block out the little guys by controlling the access to
   the development tools.
   
   The Great Divide
   ~~~~~~~~~~~~~~~~
   Many would say, why would anyone bother to write in assembler when we have
   Objects, OLE, DDE, Wizards and Graphical User Interfaces? The answer is
   simple: EVERYTHING is written in assembler, and the things that pretend to
   be development software are only manipulating someone else's assembler.
   
   Market control of the present computer industry is based on the division of
   who produces useful and powerful software and who is left to play with the
   junk that is passed off on the market as development tools.
   
   Most programmers these days are just software consumers to the Corporate
   sector and are treated as such. As the development tools get buggier and
   their vendors spend more time writing their Licence Agreements than they
   appear to spend debugging their products, the output gets slower and more
   bloated and the throughput of finished software is repeatedly crippled by
   the defects in these "objects".
   
   A simple example of market control in development software occurs in the
   Visual Basic environment.
   
   Visual Basic has always had the capacity to pass pointers to its variables.
   This is done by passing the variable by REFERENCE rather than by VALUE. The
   problem is that the VB developer does not have access to the pointer and
   has to depend on expensive aftermarket add ons to do simple things.
   
   Visual Basic has been deliberately crippled for commercial reasons.
   
   This is something like downloading and running a function crippled piece
   of shareware except that you have already paid for it. There are times
   when listening to the hype about enterprise solutions is no more than a
   formula for an ear ache.
   
   Why would a language as powerful as C and its successor C++ ever need to
   use a runtime DLL? The answer again is simple: programs that have a startup
   size of over 200k are not a threat to corporate software vendors who are
   in a position to produce C and assembler based software internally.
   
   The great divide is a THEM and US distinction between who has the power to
   produce useful things and who is left to play with the "cipher" that passes
   as programming languages.
   
   In an ideal world, a computer would be a device that knew what you thought
   and prepared information on the basis of what you needed. The problem is
   that the hardware is just not up to the task. It will be a long time into
   the future before processors do anything more than linear number crunching.
   
   The truth functional calculus that processors use through the AND, OR, NOT
   instructions is a useful but limited logic. A young Austrian mathematician
   by the name of Kurt Gödel produced a proof in 1931 that axiomatic systems
   developed from the symbolic logic of Russell and Whitehead had boundaries
   in their capacity to deliver true statements.
   
   This became known as "The indeterminacy of Mathematics" and it helps to put
   much of the hype about computers into perspective. The MAC user who asks
   the question "Why won't this computer do what I think" reveals a disposition
   related to believing the hype rather than anything intrinsic about the
   68000 series Motorola processors.
   
   Stripping away the hype surrounding processors and operating systems leaves
   the unsuspecting programmer barefoot, naked and at the mercy of large greedy
   corporations using their market muscle to extract more and more money by
   ruthlessly exploiting the need to produce software that is useful.
   
   Computer processors into the foreseeable future will continue to be no more
   than electric idiots that switch patterns of zeros and ones fast enough
   to be useful. The computer programmer who will survive into the future is
   the one who grasps this limitation and exploits it by learning the most
   powerful of all computer languages, the processor's NATIVE language.
   
   The Empire Fights Back
   ~~~~~~~~~~~~~~~~~~~~~~
   The Internet is the last great bastion of freedom of thought and this is
   where the first great battle has been won.
   
   The attempt to make the Internet into a corporate controlled desktop has
   been defeated for the moment. Choose your browser carefully or you may
   help shoot yourself in the foot by killing off the alternative.
   
   Control of knowledge is the last defence of the major software vendors and
   it is here where they are losing the battle. The Internet is so large and
   uncontrollable that the dispossessed who frequent its corridors have started
   to publish a vast array of information.
   
   Assembler is the "spanner in the works" of the greedy corporate sector.
   There are some excellent technical works that have been contributed by
   many different authors in the area of assembler. The very best in this
   field are those who have honed their skills by cracking games and other
   commercial software.
   
   It should be noted that the hacking and cracking activities of the fringe
   of computing are a different phenomenon to cracking games and commercial
   software protection schemes. The fringe play with fire when they attack
   security information and the like and complain when they get their fingers
   burnt. The attempt by the major software vendors to label the reverse
   engineering activities in the same class as the fringe is deliberate
   disinformation.
   
   These authors are at the leading edge of software research and like most
   highly skilled people, their knowledge is given out freely and is not
   tainted by the pursuit of money. It comes as no surprise that the corporate
   sector is more interested in suppressing this knowledge than they are in
   suppressing the WAREZ sites that give away their software for free.
   
   The Comeback Trail
   ~~~~~~~~~~~~~~~~~~
   Start with the collection of essays by the +ORC. You will find an incisive
   mind that gives away this knowledge without cost. Start looking for some
   of the excellent tools that can be found on the Internet, ranging from
   dis-assemblers to software in-circuit emulators (SoftICE).
   
   There are some brilliant essays written by _mammon on how to use SoftICE
   which are freely available.
   
   Dis-assemblers supply enormous quantities of code from which to start
   learning how to read and write assembler. The best starting point is the
   nearly unlimited supply of DOS com files. This is for good reason: they
   are simple in structure, being memory images, and are usually very small
   in size.
   
   The other factor is an eye to the future. COM files are an escapee from
   early eighties DOS programming where most PCs only had 64k of memory. This
   means that they are free of the later and far more complex segment
   arithmetic that DOS and 16 bit Windows EXE files are cursed with.
   
   The emerging generation of 32 bit files are called Portable Executables and
   they are written in what is called the FLAT memory model, where there is no
   64k limit. COM files were restricted to 64k absolute but could directly read
   and write anything in their address space.
   
   A portable executable file has a very similar capacity except that in 32 bit
   it can theoretically read and write anything within a 4 gigabyte address
   space. In a very crude sense, PE files are 32 bit COM files but without
   some of the other limitations.
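
   As a minimal sketch of what that flat model looks like in source code, the
   following is roughly the 32 bit counterpart of the COM file example further
   down. It is not from the original essay: it assumes a Win32 capable
   assembler with the usual Windows include files and import libraries (the
   \masm32\ paths are just one common layout), and it writes its greeting
   through the Win32 API instead of int 21h.

   ;----------------------- Hello32.ASM ---------------------------

           .386                            ; 32 bit instructions.
           .Model Flat, StdCall            ; the FLAT memory model.
           option casemap:none             ; API names are case sensitive.

           include    \masm32\include\windows.inc
           include    \masm32\include\kernel32.inc
           includelib \masm32\lib\kernel32.lib

           .Data
   Greeting   db "Hello World",13,10       ; the text as byte data.

           .Data?
   hOutput    dd ?                         ; handle for console output.
   BytesOut   dd ?                         ; count of bytes written.

           .Code
   start:
           invoke GetStdHandle, STD_OUTPUT_HANDLE
           mov hOutput, eax                ; the 32 bit screen handle.

           invoke WriteFile, hOutput, ADDR Greeting, 13, ADDR BytesOut, NULL

           invoke ExitProcess, 0           ; the 32 bit EXIT.

           end start

   ;----------------------------------------------------------------

   Linked as a console mode program, the source is as direct as the COM file
   version below, with the DOS interrupt swapped for Win32 calls.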
   
   A very good dis-assembler for COM files is SOURCER 7. Particularly in the
   early stages of exploring the structure of COM files, its capacity to
   add comments to the reconstructed source code makes the code much easier to
   read.
   
   To start making progress, you will need an assembler. Although they are
   getting a bit harder to find, you can still source either MASM or TASM and
   start writing your own COM files. The generic "Hello World" example comes
   with a lot less code than many would think.
   
   ;----------------------- Hello.ASM ----------------------------
   
   com_seg segment byte public            ; define the ONLY segment
   
           assume cs:com_seg, ds:com_seg  ; both code & data in same segment.
           org 100h                       ; go to start address in memory.
   
   start:
           mov ah, 40h                    ; the DOS function number.
           mov bx, 1                      ; the screen handle.
           mov cx, 13                     ; the length of the text, with CR/LF.
           mov dx, offset Greeting        ; the address of the text.
           int 21h                        ; get DOS to execute the function.
   
           mov ax, 4C00h                  ; the TERMINATE process function.
           int 21h                        ; call DOS again to EXIT.
   
   Greeting db "Hello World",13,10        ; specify the text as byte data.
   
   com_seg ends                           ; define the end of the segment.
   
           end start
   
   ;----------------------------------------------------------------
   
   This tiny program assembles at 31 bytes long, 18 bytes of instructions plus
   the 13 byte greeting, and it makes the point that when you write something
   in assembler you only get what you write without a mountain of junk attached
   to it. Even in C, putting printf in a bare main function with the same text
   will compile at over 2k. The humorous part is that if you dump the
   executable, printf uses DOS function 40h to output to the screen.
   
   Once you assemble a simple program of this type, immediately dis-assemble
   it and have a look at your program as it has been converted from binary back
   to code again. This will train your eye to see the relationship between
   your written code and the results of dis-assembly.
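
   The mechanics of the assemble step depend on which tools you have, so take
   the command lines below only as a rough guide and check your assembler's
   own documentation. Borland's tool chain can write the COM file directly,
   while Microsoft's older tools go through an EXE and convert it:

       tasm hello
       tlink /t hello                  (the /t switch writes HELLO.COM)

       masm hello;
       link hello;
       exe2bin hello.exe hello.com     (convert the EXE to a COM image)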
   
   This will help to develop the skill to dis-assemble programs and read them
   when you don't have the source code. Once you start on the mountain of DOS
   com files available, you will find that much of the code is very similar to
   what you have written yourself and you get to see an enormous quantity of
   well written code that you can learn from without having to pay one brass
   razoo for the privilege.
   
   Some people are slightly bemused by the +ORC's reference to Zen yet if it is
   understood in the sense that the human brain processes data at a rate that
   makes fast computer processors look like snails racing down a garden bed,
   you will start to get the idea of "feeling" the code rather than just
   munching through it like a computer does.
   
   As you read and write more code your brain will start "pattern matching"
   other bits of code that you have already digested and larger blocks of
   code will start to become very clear.
   
   Once you go past a particular threshold, the process of "data mapping" and
   "model fitting" starts to occur. This is where you know enough to project
   a model of what is happening and then test it to see if it works the way
   you have modelled it. The rest is just practice and a willingness to keep
   learning.
   
   Once you get the swing of manipulating data in assembler, you will start to
   comprehend the power and precision that it puts in your hands. Contrary to
   the "buzz word" area of software where logic is couched in "Boolean" terms,
   the foundation of logic is called "The law of excluded middle". In layman's
   terms, something either is or it ain't but it can't be both.
   
   George Boole and others like Augustus De Morgan developed parts of logic
   during the nineteenth century but it was not until Russell and Whitehead
   published "Principia Mathematica" shortly before the first world war that
   logic became a complete and proven system. Russell based much of this
   milestone in reasoning on a very important distinction, the difference
   between EXTENSIONAL and INTENSIONAL truth.
   
   Something that is spatio-temporally "extended" in the world is subject to
   the normal method of inductive proof, whereas things that are "intensional"
   cannot be either proven or disproven.
   
   Logic had been held back for centuries by the assumption that it was a
   branch of metaphysics, until Russell and Whitehead delivered the proof
   that logic is "hard wired" into the world.
   
   This is important to computing in very fundamental ways. The devices that
   programming is about controlling are very "hard wired" in the way that they
   work. Recognise the distinction between what the devices ARE as against
   what some would like them to BE, or worse, the bullshit that is peddled
   to an unsuspecting public about the "wonders" of computers, and you have
   made one of the great leaps forward.
   
   The devices are in fact very powerful at manipulating data at very high
   speed and can be made very useful to the most powerful of all processors,
   the conceptual apparatus of the brain using it.
   
   The only reason why this distinction has ever been inverted is through the
   greed and arrogance of corporate software vendors and their desire to
   extract yet another quick and dirty buck.
   
   In this sense, the large commercial vendors are in the same class as the
   proliferation of low class smut vendors clogging up the Internet, they lure
   with the promise of fulfilling the "lusts of the flesh" yet when they
   extract the money that they are after, they leave their victims both poorer
   and unfulfilled.
   
   Most ordinary computer users part with reasonably large sums of money when
   they buy a computer and load it with software yet the promise of fun,
   convenience and usefulness is more often than not followed by the bugs,
   crashes and defects in the software. A Faustian bargain where the hidden
   cost is not seen until the money is handed over.
   
   The EXIT clause for the programmers who are wise enough to see that their
   skills are being deliberately eroded for marketing reasons is the most
   powerful tool of all, the direct processor instructions that assembler
   puts in their hands.
   
   The time to make the move to learning assembler is not open ended. DOS is
   a passing world and without the background of starting in a simpler world
   that DOS has given us for so many years, the move to assembler will be
   much harder and less accessible. There are probably only a couple of years
   left.
   
   If you are not robust enough to use the +ORC's formula for martinis, pure
   malt has a glow to it that surpasses all understanding.
   
   FUTURE VISION
   
