Tuesday, December 16, 2014

String Theory of Computing

Team. Many modern-day physicists believe that a multidimensional unit of existence that they call a "string" is the fundamental building block of the universe. Before one becomes an accomplished physicist, one must master "strings". There is a parallel in computing, software engineering, and traditional data processing. Thankfully, this parallel is not nearly as complex. Remember, we believe in keeping things simple; physicists lost that philosophy in their profession with the advent of the numerous counter-intuitive theories which arose after Einstein turned Newton's classical physics of the macroscopic, sub-lightspeed world on its ear.

In the world of computing, a string is simply a sequence of letters, numbers, or other symbols. Each of these weblog posts is simply an extended string. Strings have the property that each contiguous sub-portion of a string is also a string, which we call a sub-string. For instance, the word "Team" which starts this post is a sub-string at the head of the string which represents the first paragraph. A sub-string is simply a small contiguous portion of a larger string, and it is a string itself.
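A minimal sketch of the sub-string idea in Python follows. The variable names and the shortened sample text are assumptions made for illustration only.

```python
# A shortened stand-in for the first paragraph of this post (illustrative text).
paragraph = "Team. Many modern-day physicists believe in strings."

# Slicing takes a contiguous portion of a string; the result is itself a string.
head = paragraph[0:4]

print(head)                        # Team
print(isinstance(head, str))       # True
print(paragraph.startswith(head))  # True
```

The slice is a string in its own right, so anything one can do with the paragraph one can also do with the sub-string.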

More abstractly, the primary and secondary storage devices on any electronic computing device hold a string of zeroes and ones represented by electromagnetic signals of varying intensity. Depending on the collating sequence, collections of these electronic states grouped by eight, sixteen, thirty-two, or sixty-four represent the various symbols which a computer can manipulate. A collating sequence, as the term is used here (it is more commonly called a character encoding), is simply a mapping between sequences of zeroes and ones of a given, fixed size and symbols.
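As a rough sketch of that mapping, the Python below splits a string of zeroes and ones into groups of eight and looks up the symbol for each group. The function name `decode_bits` and the choice of eight-bit groups are assumptions for this example; `chr` uses Python's built-in table, which agrees with ASCII for these values.

```python
# A string of zeroes and ones, as it might sit in storage.
bits = "01010100011001010110000101101101"

def decode_bits(bits, width=8):
    """Split the bit string into fixed-size groups and map each group to a symbol."""
    groups = [bits[i:i + width] for i in range(0, len(bits), width)]
    # int(group, 2) reads the group as a binary number; chr() looks up its symbol.
    return "".join(chr(int(group, 2)) for group in groups)

print(decode_bits(bits))  # Team
```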

One of the earliest common, standard sequences was EBCDIC, which stands for Extended Binary Coded Decimal Interchange Code. It is still in common use on many mainframe computers.

Another popular and common sequence is ASCII, the American Standard Code for Information Interchange. It is a standard that has been used with mini- and micro-computers since the early sixties. ASCII supports one hundred and twenty-eight symbols, including the English alphabet, 'a' - 'z' and 'A' - 'Z', plus the digits, '0' - '9'.
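To see that the same string maps to different binary numbers under different sequences, here is a small Python sketch. It assumes the `cp037` codec, which is one of several EBCDIC code pages (US/Canada), as a stand-in for EBCDIC in general.

```python
text = "Team"

ascii_bytes = text.encode("ascii")    # ASCII mapping
ebcdic_bytes = text.encode("cp037")   # one EBCDIC code page, chosen for illustration

print(ascii_bytes.hex())   # 5465616d
print(ebcdic_bytes.hex())  # e3858194

# Decoding with the matching table recovers the original string.
print(ebcdic_bytes.decode("cp037"))  # Team
```

The symbols are identical; only the table used to look them up has changed.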

As computers expanded in their capabilities and needed to express more complex and diverse strings, vendors and standards bodies developed eight-bit extensions of ASCII, collectively known as Extended-ASCII. Each of these lists supports up to two hundred and fifty-six symbols.

It should be emphasized that the word "sequence" has been used in more than one way: firstly, describing strings, and, secondly, describing the lists or tables which hold mappings between a sequence of zeroes and ones, called a binary number, and a symbol. It should also be added that the binary numbers in these tables often appear in related forms, such as octal or hexadecimal numbers, which are more readable and compact. One can easily convert between binary and octal or hexadecimal by simply grouping the zero and one symbols in sets of three digits (for octal) or four digits (for hexadecimal) and converting each group to a single digit.
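The grouping trick can be sketched in a few lines of Python. The helper name `regroup` is hypothetical; it pads the binary number on the left so that it divides evenly into groups, then converts each group to one digit.

```python
def regroup(bits, group_size):
    """Convert a binary string to octal (group_size=3) or hexadecimal (group_size=4)."""
    # Pad on the left with zeroes so the length is a multiple of the group size.
    padded_len = -(-len(bits) // group_size) * group_size
    padded = bits.zfill(padded_len)
    groups = [padded[i:i + group_size] for i in range(0, len(padded), group_size)]
    # Each group is a small binary number that maps to exactly one digit.
    return "".join("0123456789abcdef"[int(g, 2)] for g in groups)

bits = "01010100"        # the ASCII code for 'T'
print(regroup(bits, 3))  # 124
print(regroup(bits, 4))  # 54
```

Both results name the same number as the original binary string: octal 124 and hexadecimal 54 are each decimal 84.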

Finally, in the more recent years of computing, engineers at Xerox and Apple in Northern California sought the production of a universal collating sequence which could represent any symbol in any language, whether current or historical, plus provide room for expansion in the future. This collating sequence is known as Unicode and currently supports over one hundred thousand symbols. As with general purposefulness, universality is a common goal found in computing.
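A brief Python sketch shows Unicode at work on symbols from several scripts. The helper name `describe` and the sample symbols are assumptions for illustration, and UTF-8 is only one of the ways in which Unicode numbers are stored as zeroes and ones.

```python
def describe(symbol):
    """Return the Unicode code point of a symbol and its UTF-8 bytes in hexadecimal."""
    code_point = f"U+{ord(symbol):04X}"
    return code_point, symbol.encode("utf-8").hex()

for symbol in ["T", "é", "Ω", "漢"]:
    print(symbol, *describe(symbol))
```

Note that the first symbol keeps the same number that it has in ASCII, while the others need more than eight zeroes and ones apiece.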

Strings are everywhere in computing. This is our first axiom of string theory, the Ubiquity Axiom. For this theory, a string is like a point in geometry.

Our second axiom is that, in essence, computing is the passing of strings between data sources and data targets (sinks, for those of you who think in terms of discrete graphs and abstract networks). This is our Axiom of Transfer.

The next axiom, our Axiom of Transformation, states that while a string is in transfer, computational modules might mutate it as needed.

Our fourth and last axiom holds that the final state of any computation is a transferable string. This is the Axiom of Invariance.
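The axioms can be pictured with a small Python sketch. This is not CABOOSE code; the function name `transfer` and the in-memory source and sink are assumptions chosen to keep the example self-contained.

```python
import io

def transfer(source, sink, transform=lambda s: s):
    """Move each string from source to sink, optionally mutating it in transit."""
    for line in source:
        sink.write(transform(line))

source = io.StringIO("team\nstrings are everywhere\n")  # data source
sink = io.StringIO()                                     # data target (sink)

# Axiom of Transfer plus Axiom of Transformation: upper-case each string en route.
transfer(source, sink, transform=str.upper)

# Axiom of Invariance: the final state is itself a transferable string.
result = sink.getvalue()
print(result, end="")
```

Because the result is a string, it could in turn serve as the source for another transfer, perhaps to a file, a socket, or a database column.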

To clarify CABOOSE: its primary goal is simplifying the string processing and passing which occurs in layered type-II model-view-controller web applications. As a software engineer, whenever you become overwhelmed by the technical nature of the computing task which you are performing, remember that ultimately you are only moving a string between point A and point B. Software development can frequently be muddled by unfamiliar technical terms, domain knowledge, and a learning curve. When puzzled by a new technological paradigm, language, or method of concerns partitioning, simply stop, identify the strings in the system, their sources, and their targets, and research the ways in which one moves a string between each source and target in a programmer's guide, language reference, or tutorial at a quality developer's internet site. There are many developer resources on the web, such as the Oracle Technology Network (OTN), the Microsoft Developer Network (MSDN), perl.org, php.net, python.org, plus many more. It is best if you seek development advice from the originators of the technology which you might be using.

Experience in scientific and business computing has taught much. After scripting in Magical on a Varian 500 MHz Nuclear Magnetic Resonance spectrometer, report writing on retired Wang mini-computers using the Professional Application Creation Environment (PACE) and its fourth generation language, building a computer telephony integration device using C, UNIX and a Dialogic API, providing decision support services with COBOL in OS/360 and OS/390 environments, scripting common gateway interface processes during the early days of the web using HTML, JavaScript, C, UNIX, and embedded SQL, creating business support applications in VB6 and VB.NET, increasing the sales of small businesses using Java, creating web applications in Perl, PHP, and Python, plus teaching all of these skills and more, I am convinced that at the end of the day the summation of all of the efforts of the modern computing professional is simply the passing of strings between source and sink.

Please pardon the longer post. Occasionally, making a concept "plain" requires an extended explanation. Hopefully, this post has helped you decipher what we are doing with CABOOSE.

Concepts are rarely ever over-simplified. We will also examine some of the other mysteries in modern computing in this Wednesday's post, the "Number Theory of Computing".

The redacted source code for the progress which we have made on the prototype will be posted this Friday. La-La.


