Base64 and ASCII both made perfect sense in terms of their requirements, and the future, while not fully anticipated at the time, is doing just fine, with ASCII now incorporated into the largely future-proof UTF-8.
Considerably stranger in regard to contiguity was EBCDIC, but it too made sense in terms of its technological requirements, which centered around Hollerith punch cards. https://en.wikipedia.org/wiki/EBCDIC
There are numerous other examples where a lack of knowledge of the technological landscape of the past leads some people to project unwarranted assumptions of incompetence onto the engineers who lived under those constraints.
(Hmmm ... perhaps I should have read this person's profile before commenting.)
P.S. He absolutely did attack the competence of past engineers. And "questioning" backwards compatibility with ASCII is even worse ... there was no point in time at which a conversion would not have faced an insurmountable barrier.
And the performance claims are absurd, e.g.,
"A simple and extremely common int->hex string conversion takes twice as many instructions as it would if ASCII was optimized for computability."
WHICH conversion, uppercase hex or lowercase hex? You can't have both. And it's ridiculous to think that the character set encoding should have been optimized for either one or that it would have made a measurable net difference if it had been. And instruction counts don't determine speed on modern hardware. And if this were such a big deal, the conversion could be microcoded. But it's not--there's no critical path with significant amounts of binary to ASCII hex conversion.
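For concreteness, here is a minimal sketch of the conversion in question (my own toy code, with uppercase chosen arbitrarily -- which is exactly the arbitrary choice at issue). The '9'-to-'A' gap costs one extra compare/adjust in the arithmetic form, and real code sidesteps it entirely with a table:

    #include <stdio.h>
    #include <stdint.h>

    /* Sketch only: one nibble to one ASCII hex digit.
       Because '9' (0x39) and 'A' (0x41) are not adjacent code points,
       the arithmetic form needs one extra compare/adjust -- which is
       exactly why real code tends to use the table form instead. */
    static char hex_arith(unsigned n) {
        n &= 0xF;
        return (char)(n < 10 ? n + '0' : n - 10 + 'A');  /* extra adjust */
    }

    static char hex_table(unsigned n) {
        return "0123456789ABCDEF"[n & 0xF];  /* one indexed load, no branch */
    }

    int main(void) {
        uint32_t x = 0xDEADBEEF;
        for (int shift = 28; shift >= 0; shift -= 4)
            putchar(hex_table(x >> shift));  /* prints DEADBEEF */
        putchar('\n');
        putchar(hex_arith(0xB));             /* prints B */
        putchar('\n');
        return 0;
    }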
"There are also inconsistencies like front and back braces/(angle)brackets/parens not being convertible like the alphabet is."
That is not a usable conversion. Anyone who has actually written parsers knows that the encodings of these characters are not relevant ... nothing would have been saved in parsing "loops". Notably, programming language parsers consume tokens produced by the lexer, and the lexer processes each punctuation character separately. Anything that could be gained by grouping punctuation encodings can be done via the lexer's mapping from ASCII to token values. (I have actually done this to reduce the size of bit masks that determine whether any member of a set of tokens has been encountered. I've even, in my weaker moments, hacked the encodings so that <>, {}, [], and () are paired--but this is pointless premature optimization.)
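Something like this sketch, to be concrete (illustrative only -- the token names and input string are made up):

    #include <stdio.h>
    #include <stdint.h>

    /* Illustrative only: the lexer maps raw ASCII to small dense token
       codes, so "is any member of this token set present?" becomes a
       single mask test -- independent of where the characters happen
       to sit in the ASCII table. */
    enum tok { TOK_LPAREN, TOK_RPAREN, TOK_LBRACK, TOK_RBRACK,
               TOK_LBRACE, TOK_RBRACE, TOK_LT, TOK_GT, TOK_OTHER };

    static enum tok classify(int c) {
        switch (c) {
            case '(': return TOK_LPAREN;  case ')': return TOK_RPAREN;
            case '[': return TOK_LBRACK;  case ']': return TOK_RBRACK;
            case '{': return TOK_LBRACE;  case '}': return TOK_RBRACE;
            case '<': return TOK_LT;      case '>': return TOK_GT;
            default:  return TOK_OTHER;
        }
    }

    int main(void) {
        const uint32_t OPENERS = (1u << TOK_LPAREN) | (1u << TOK_LBRACK)
                               | (1u << TOK_LBRACE) | (1u << TOK_LT);
        const char *src = "a[i] + f(x)";
        uint32_t seen = 0;
        for (const char *p = src; *p; ++p)
            seen |= 1u << classify((unsigned char)*p);
        printf("opener seen: %s\n", (seen & OPENERS) ? "yes" : "no");
        return 0;
    }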
Show me a quote. Where did I attack the competence of past engineers? Quote it for me or please just stop lying. I never attacked anyone. I even (somewhat obliquely) referred to several reasons they may have had to make decisions that confound me. Are you mad that I think backwards compatibility is a poor decision? That's not an attack against any engineers, it's just a matter of opinion. Your weird passive-aggressive behavior is just baffling here.
Here is a quote: "that seemed like it made sense at some point (or maybe changing case was super important to a lot of workloads or something, making a compelling reason to fuck over the future in favor of optimisation now))?"
You used "that seemed like it made sense" when you could have written "that made sense." The additional "seemed like" implies the past engineers were unable to see something they should have.
You used "fuck over the future in favor of optimisation now" implying the engineers were overly short-sighted or used poor judgement when balancing the diverse needs of an interchange code.
Hindsight is 20/20. Something that seemed like a good decision at the time may have been a good decision for the time, but not necessarily a great decision half a century later. That has nothing to do with engineering competency, only fortune telling competency.
I get that people here don't like profanity, but I don't see any slight in describing engineering decisions like optimizing for common workloads today over hypothetical loads tomorrow as 'fucking over the future'. Slightly hyperbolic, sure, but it's one of the most common decisions made in designing systems, and it commonly causes lots of issues down the line. I don't see where saying something is a mistake that looks obvious in retrospect is a slight. Most things look obvious in retrospect.
Again, "seemed like it made sense" expresses doubt, in the way that "it seems safe" expressed doubt that it actually is safe.
If you really meant what you're saying now, there was no reason to add "seemed like it" in your earlier text.
> I don't see any slight
You can see things however you want. The trick is to make others understand the difference between what you say and the utterances of an ignorant blowhard, "full of sound and fury, signifying nothing."
You don't seem to understand the historical context, your issues don't make sense, your improvements seem pointless at best, and you have very firm and hyperbolic viewpoints. That does not come across as 20/20 hindsight.
P.S. I'm not the one lying here. Not only are there lies, strawmen, and all sorts of projection, but my substantive points are ignored.
"some backwards compatibility idiocy that seemed like it made sense at some point"
is obviously an attack on their judgment.
"a compelling reason to fuck over the future in favor of optimisation now"
Talk about passive-aggressive! Of course the person who wrote this does not think that there was any such "compelling reason", which leaves us with the extremely hostile accusation.
And as I've noted, the arguments that these decisions were idiotic or effed over the future are simply incorrect.
What is your preferred system? How does it affect other needs, like collation, or testing if something is upper-case vs. lower-case, or ease of supporting case-insensitivity?
In the following, the test goes from two assembly instructions to three:
    int is_letter(char c) {
        c |= 0x20;  // normalize to lowercase
        return ('a' <= c) && (c <= 'z');
    }
Yes, that's 50% more assembly, to add a single bit-wise or, when testing a single character.
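To make the count concrete, here's a sketch of the comparison (my own illustration, with a single-case range check standing in for the two-instruction form):

    #include <stdio.h>

    /* Sketch of the instruction counts being discussed: a single unsigned
       range check is typically a subtract plus one compare; folding case
       first costs one extra OR on top of that. */
    static int is_lower(char c) {
        return (unsigned char)(c - 'a') <= 'z' - 'a';           /* ~2 instructions */
    }

    static int is_letter_folded(char c) {
        return (unsigned char)((c | 0x20) - 'a') <= 'z' - 'a';  /* + one OR */
    }

    int main(void) {
        printf("%d %d %d\n", is_lower('Q'), is_letter_folded('Q'), is_letter_folded('!'));
        return 0;  /* prints 0 1 0 */
    }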
But, seriously, when is this useful? English words include an apostrophe, names like the English author Brontë use diacritics, and æ is still (rarely) used, like in the "Endowed Chair for Orthopædic Investigation" at https://orthop.washington.edu/research/ourlabs/collagen/peop... .
And when testing multiple characters at a time, there are clever optimizations like those used in UlongToHexString. SIMD within a register (SWAR) is quite powerful, e.g., 8 characters can be or'ed at once in 64 bits, and of course the CPU can do a lot of work to pipeline things, so 50% more single-clock-tick instructions does not mean 50% more work.
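A toy example of the SWAR idea (mine, not the actual UlongToHexString code; it assumes the eight bytes are already known to be letters):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    /* Toy SWAR sketch: OR the case bit into eight ASCII bytes at once.
       Real code first masks out bytes that aren't letters; here the
       input is assumed to be A-Z/a-z only. */
    int main(void) {
        char buf[9] = "HELLOabc";
        uint64_t w;
        memcpy(&w, buf, 8);              /* load 8 characters into one register */
        w |= 0x2020202020202020ULL;      /* one OR lowercases all 8 bytes */
        memcpy(buf, &w, 8);
        printf("%s\n", buf);             /* prints helloabc */
        return 0;
    }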
> like front and back braces/(angle)brackets/parens not being convertible
I have never needed that operation. Why do you need it?
Usually when I find a "(" I know I need a ")", and if I also allow a "[" then I need an if-statement anyway since A(8) and A[8] are different things, and both paths implicitly know what to expect.
> and saved a few instructions in common parsing loops.
Parsing needs to know what specific character comes next, and parsers are very rarely limited to only those characters. The ones I've looked at use a DFA, e.g., via a switch statement or a lookup table.
I can't figure out what advantage there is to that ordering, that is, I can't see why there would be any overall savings.
Especially in a language like C++ with > and >> and >>= and A<B<int>> and -> where only some of them are balanced.
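For what it's worth, the usual shape of that code looks something like this sketch (not taken from any particular parser); note that '>' needs one-character lookahead regardless of how the brackets are encoded:

    #include <stdio.h>

    /* Sketch: a switch-based lexer dispatches on the exact character, so
       whether '(' and ')' are adjacent code points never matters, and
       '>' vs ">>" needs lookahead either way. */
    enum tok { T_LPAREN, T_RPAREN, T_LBRACE, T_RBRACE,
               T_LT, T_GT, T_SHR, T_OTHER, T_EOF };

    static enum tok next_token(const char **p) {
        int c = *(*p)++;
        switch (c) {
            case '\0': --*p; return T_EOF;
            case '(':        return T_LPAREN;
            case ')':        return T_RPAREN;
            case '{':        return T_LBRACE;
            case '}':        return T_RBRACE;
            case '<':        return T_LT;
            case '>':
                if (**p == '>') { ++*p; return T_SHR; }  /* lookahead for ">>" */
                return T_GT;
            default:         return T_OTHER;
        }
    }

    int main(void) {
        const char *src = "f(a) > b >> {c}";
        enum tok t;
        while ((t = next_token(&src)) != T_EOF)
            printf("%d ", (int)t);
        putchar('\n');  /* prints the token codes for the sample input */
        return 0;
    }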