• dwheeler 11 days ago

    I understood the problem, but I found the page's explanation a little confusing at first. In particular, "lexical differential highlighting" misled me, because the word "differential" made me think that his algorithm was comparing lines or tokens in some way, and it doesn't do that.

    Basically, this algorithm tokenizes the source code, and tries to color each token so that identical tokens have the same color, but similar-looking tokens have very different colors. When tokenizing it specially handles comments and quoted text.
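
    A minimal sketch of that description, not the article's actual code. The token pattern, the `;` comment marker, the gray used for comments and quoted text, and the names `token_color` and `colorize` are all assumptions made for illustration. Hashing gives identical tokens the same hue and makes the hues of similar-looking tokens unrelated, which usually (but not always) means visibly different.

```python
import colorsys
import hashlib
import re

# Hypothetical token pattern: quoted strings and ';' comments are kept whole,
# everything else is split into runs of word characters plus % . $
TOKEN_RE = re.compile(r'"[^"]*"|;.*|[\w%.$]+')


def token_color(token):
    """Map a token to a '#rrggbb' color derived from a hash of its text."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    hue = digest[0] / 256.0  # first hash byte picks the hue
    r, g, b = colorsys.hls_to_rgb(hue, 0.45, 0.9)
    return "#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255))


def colorize(line):
    """Return (token, color) pairs; comments and quoted text get a fixed gray."""
    pairs = []
    for match in TOKEN_RE.finditer(line):
        tok = match.group(0)
        if tok.startswith(('"', ';')):
            pairs.append((tok, "#808080"))
        else:
            pairs.append((tok, token_color(tok)))
    return pairs
```

    Calling `colorize('pmulhw %ecx, %ecx ; note')` yields the same color for both `%ecx` tokens and gray for the trailing comment.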

    That's an interesting approach to countering errors from "it's almost the same but I didn't notice they were different". I wonder - if I were trying to review source code that was malicious, maybe I could vary the color algorithm using a random source so that the source code writer couldn't make different tokens look similar in color. That might be an interesting countermeasure to some kinds of underhanded code.
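
    One way the random-source idea might look, as a sketch only: key the hash with a per-review secret so the code's author cannot predict which tokens end up with similar hues. The function name `make_salted_hue` and the choice of keyed BLAKE2b are assumptions, not something the article does.

```python
import hashlib
import secrets


def make_salted_hue(salt=None):
    """Return a token -> hue function keyed by a secret, per-session salt."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each review

    def hue(token):
        # Keyed BLAKE2b: without the salt, the hue of a token is unpredictable.
        digest = hashlib.blake2b(
            token.encode("utf-8"), key=salt, digest_size=2
        ).digest()
        return int.from_bytes(digest, "big") / 65536.0  # hue in [0, 1)

    return hue
```

    Within one session the mapping stays stable, so identical tokens still share a color; a new session draws a new salt and reshuffles all hues.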

  • saagarjha 10 days ago

    Yeah, I thought this would do something like highlight all "mov" derivatives the same way and was somewhat surprised at the brevity of the code at the bottom…

  • shipof123 10 days ago

    That reminds me of something I read in Applied Cryptography when I was young about how one could theoretically pass messages with “ \b” to generate infinite versions of “identical” text to cause collisions

  • kazinator 11 days ago

    This idea is related to "rainbow parentheses" (e.g. for Lisp): different levels of parens just get arbitrary different colors. But matching parens are the same color, just like two occurrences of %ecx in the same line are the same.
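
    A rough sketch of rainbow parentheses as described here: color is chosen by nesting depth, so a matching pair always shares a color. The palette and the name `rainbow_parens` are arbitrary, hypothetical choices.

```python
# Arbitrary, hypothetical palette; it wraps around for deeper nesting.
PALETTE = ["red", "orange", "green", "blue", "purple"]


def rainbow_parens(text):
    """Return (index, char, color) for each paren; matching pairs share a color."""
    result = []
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            result.append((i, ch, PALETTE[depth % len(PALETTE)]))
            depth += 1
        elif ch == ")":
            # Step back out first so the closer gets its opener's color.
            depth = max(depth - 1, 0)
            result.append((i, ch, PALETTE[depth % len(PALETTE)]))
    return result
```

    For `"(a (b) c)"` the outer pair comes out red and the inner pair orange.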

  • andrepd 11 days ago

    It's legitimately one of the best features of Excel. Does anybody know how I can achieve that in Sublime? The few options I found were subpar.

  • kaibee 10 days ago

    Don't know about Sublime, but there's a plugin that does this for Visual Studio.


    Probably not helpful to you, but maybe some other lurker.

  • neotek 10 days ago

    I'd love to know the answer to this as well, it would be so useful.

  • human_banana 10 days ago

    In Emacs there's a package rainbow-delimiters-mode for parentheses, braces, brackets and whatnot, and rainbow-identifier-mode which gives variable names unique colors.

  • fake-name 11 days ago

    There's a sublime text package that does this for a bunch of different languages: https://github.com/vprimachenko/Sublime-Colorcoder

    I'm not involved in any way, I just ran it for a while at one point.

  • synthc 11 days ago

    There is also an emacs package that does something similar: https://github.com/jacksonrayhamilton/context-coloring

  • synthc 11 days ago

    I think DrRacket also has something like this, but it shows lines between identical variables instead of using colors.

  • xvilka 10 days ago

    Seems dead for many years already.

  • cjs_2 10 days ago

    How many updates per month are you expecting for a package like this?

  • xvilka 10 days ago

    Multiple times a day, like radare2. Seriously, if there is no activity in 6 months - then the project is dead.

  • mikekchar 10 days ago

    This is a lexical highlighter that tries to highlight similar, but different text differently. There's a point in time where there are no new features necessary.

    radare2 is a portable reversing framework. I can't think of 2 projects more dissimilar. Perhaps you were thinking that the highlighter actually did something other than color text in an arbitrary way? Can you give an example of something that you would expect to change about it, especially at the rate of multiple times a day?

  • guessmyname 10 days ago

    > There's a sublime text package that does this for a bunch of different languages

    You don’t need a package for this, Sublime Text 3 already does this automatically [1].

    [1] https://www.sublimetext.com/docs/3/color_schemes.html#hashed...

  • nh2 10 days ago

    How can I use it?

    The simplest way seems to be to use the "Celeste" color scheme which implements this. Is this the only way? I'd like to use a dark theme, like the default Monokai.

  • guessmyname 10 days ago

    Yes, “Celeste” is the only theme with support for semantic highlighting.

    For dark mode, I use this project — https://github.com/cixtor/monnokay

  • fake-name 10 days ago

    Well, neat!

    I haven't used the plugin since the ST2 days, so I didn't realize it was no longer needed.

  • soulofmischief 10 days ago

    Webstorm has an option for this and it makes things like dense enclosures or JSON actually parsable.

  • galaxyLogic 10 days ago

    Which feature is that? I've been using WebStorm for some time and wishing for a feature that would highlight all matching parentheses (), [] and {}.

  • _virtu 10 days ago

    - plugin: rainbow brackets

    - preference: semantic highlighting

  • galaxyLogic 10 days ago

    Thanks. I tried it but it did not quite do what I needed so I uninstalled it. (I'm afraid of plugins in general taking performance away.) It worked on JS files, but I have HTML documents containing (example) JavaScript etc. code, and it did not seem to react to parentheses in them. Also, even in plain JS files you may have strings containing parentheses.

    Standard WebStorm already highlights matching parentheses in JavaScript and does a good job at that.

  • soulofmischief 9 days ago

    I don't use rainbow brackets, but I do use semantic highlighting. It's worth seeing if semantic highlighting would still be useful to you. It greatly helps scanning speed.

  • cylon13 10 days ago

    What made you decide to stop using it?

  • zokier 10 days ago

    Complete tangent but one thing that I've wondered about modernish asm mnemonics is how complex they are, and especially how much type information they encode in a semi-structured way. Taking the author's example of PMULHUW, the core operation is MUL(tiply), P for packed integers, H for high result, U for unsigned, and W for word sized (16 bit). I feel like there must be a better way to express the same thing that wouldn't lead stuff looking like one word all caps alphabet soup. I don't know exactly what that would be; spelling out everything would probably make assembly way too verbose. So some sort of middle ground would be nice.

  • chc4 10 days ago

    > I feel like there must be a better way to express the same thing that wouldn't lead stuff looking like one word all caps alphabet soup.

    Yes, that's called a programming language :^)

    Assembly is usually essentially a macro engine over the actual instructions you are emitting for your processor, and the Intel x86 chip manuals or whatever you're targeting use the outrageously long proper names, so your assembly will too. Heck, the author mentions specifically reading assembly too, so knowing what you're reading is 1:1 with the actual instruction stream is helpful, no matter how bad the official names are.

    Actual programming languages just abstract away some complex instructions like SSE vectorizing (which have famously terrible names) to some high-level API and intrinsic functions. And you should too.

  • zokier 10 days ago

    > the Intel x86 chip manuals or whatever you're targeting use the outrageously long proper names, so your assembly will too.

    I don't see why that has to be the case; why must I use Intel-specified mnemonics instead of my own syntax? While not as radical, the AT&T vs. Intel syntax split demonstrates that the vendor syntax is not the only option. As long as the syntax captures all the details of instructions to be completely unambiguous then it should be perfectly interchangeable.

    I specifically do not desire higher level of abstraction because I want to maintain that 1:1 relation with the actual machine code. Heck, even Intel mnemonics do not truly have 1:1 relation to machine code, because the instruction (encoding) can depend on operand types.

  • breck 10 days ago

    I’ve done some experiments with tree languages that compile to ASM. I think it’s definitely the way forward.

  • okaleniuk 10 days ago

    Actually, it would be interesting to experiment with coloring all the abbreviations separately. P, then MUL, then H, then U, then W (or UW altogether). Not sure if it works, but it's something worth trying.
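
    A small sketch of that experiment, assuming a hand-written pattern that only covers the packed word multiplies mentioned in this thread (pmulhw, pmulhuw and the like). `MNEMONIC_RE` and `split_mnemonic` are hypothetical names, and the pattern is illustrative: it does not check that a mnemonic really exists.

```python
import re

# Illustrative pattern only: P + MUL + H/L + optional U + W.
MNEMONIC_RE = re.compile(r"^(p)(mul)([hl])(u?)(w)$", re.IGNORECASE)


def split_mnemonic(mnemonic):
    """Split a packed-multiply mnemonic into parts that can be colored separately."""
    m = MNEMONIC_RE.match(mnemonic)
    if m is None:
        return [mnemonic]  # unknown shape: keep it as a single token
    return [part for part in m.groups() if part]
```

    `split_mnemonic("PMULHUW")` gives `["P", "MUL", "H", "U", "W"]`, while an instruction such as `mov` is returned unchanged as one token.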

  • gpspake 10 days ago

    I remember Doug Crockford mentioning the idea of scope based highlighting for JavaScript in a workshop years back and thinking it would be useful. Cool to see it pop back up here.

    Edit: Here's a scope based js highlighting repo that cites Crockford as the inspiration but unfortunately he posted the linked description on Google+ so... uh... oops https://github.com/azz/vscode-levels

  • lifthrasiir 10 days ago

    [1] was a similar idea where color is determined by the prefix, so for example `currentIndex` and `randomIndex` are distinguished from each other but `currentIndex` and `currentIdx` are not.

    I'm not sure about either, because i) there are only a handful of mutually distinguishable colors ([1] does mention the same complication), and ii) we often want to highlight both the similarity and the difference among identifiers and the cutoff is not clear. For i) we may want to leverage more formatting; for ii) I really don't have a good solution.

    [1] https://medium.com/@evnbr/coding-in-color-3a6db2743a1e
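
    To make the prefix idea concrete, here is a sketch that assumes "prefix" means the leading word of a camelCase or snake_case identifier. That definition, and the names `prefix_key` and `prefix_hue`, are assumptions; [1] may define the prefix differently.

```python
import re
import zlib


def prefix_key(identifier):
    """Return the leading word of a camelCase or snake_case identifier."""
    m = re.match(r"[A-Za-z][a-z0-9]*", identifier)
    return m.group(0).lower() if m else identifier


def prefix_hue(identifier):
    """Hue in degrees (0-359), determined only by the identifier's prefix."""
    return zlib.crc32(prefix_key(identifier).encode("utf-8")) % 360
```

    Under this definition `currentIndex` and `currentIdx` share the key `current` and therefore the same hue, while `randomIndex` is keyed on `random`.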

  • css 10 days ago

    Wow, this actually looks amazing for math (though it seems to be stripping out a lot of the code I pasted in): https://i.imgur.com/Iur9FgK.png

    How difficult would it be to implement this as a VSCode extension?

  • petschge 10 days ago

    This looks pretty good, but notice how it does not split "log(difference_squared" into two tokens. Adding '(' and ')' as delimiters should fix that.
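
    The effect of the delimiter set can be sketched as follows. This is an illustration rather than the page's tokenizer, and the default delimiters are assumed.

```python
import re


def tokenize(text, delimiters=" \t,"):
    """Split text on runs of any delimiter character, dropping empty pieces."""
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok for tok in re.split(pattern, text) if tok]
```

    With the assumed defaults, `tokenize("x = log(difference_squared)")` keeps `log(difference_squared)` as one token; passing `delimiters=" \t,()"` splits it into `log` and `difference_squared`.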

  • css 10 days ago

    Good point. That helps, but it still strips about half of the lines of my code out for some reason. Specifically, this part: https://i.imgur.com/L117fYm.png

  • BenFrantzDale 10 days ago

    I love that visually I can find usages of, say, `alpha`.

    I do wish it did some syntax highlighting, but one could easily imagine blending between this and conventional syntax highlighting.

  • panopticon 10 days ago

    Tangential, but "Just as every other piece of code on Words and Buttons, it's properly unlicensed." reads like the code is literally unlicensed and not using the Unlicense license.

    It's a little weird to me because unlicensed code is very different than the Unlicense license.

  • ChrisSD 10 days ago

    And I'd add that CC0 is more "properly unlicensed" than the Unlicense is. Or at least more thoroughly so.

  • canadaduane 11 days ago

    I think this is also called semantic coloring. Visual Studio Code has it on the roadmap to try this year: https://github.com/Microsoft/vscode/wiki/Roadmap#editor

  • sixplusone 11 days ago

    No, semantic coloring is about the editor having deep knowledge about your code, this is about having very similar looking names or lexemes appear different. FTA:

    It's fine that mov doesn't look like eax, but I'd rather prefer pmulhw and pmulhuw to be shown as differently as possible.

  • jcelerier 11 days ago

    KDevelop pioneered this a decade ago: https://zwabel.wordpress.com/2009/01/08/c-ide-evolution-from...

  • gmueckl 10 days ago

    Eclipse has also had this for ages at this point. I don't remember when they introduced it, but when you can memorize the meanings of all the colors, it's great.

  • m0zg 10 days ago

    I'm not a fan of this approach in general, but I am a fan of highlighting instructions from different subsets in different colors in asm, and perhaps differentiating the saturation by latency/throughput. I.e. a "heavy" instruction should probably be bright, urgent red, whereas loads, stores, adds, bit ops should probably be more muted.

  • IshKebab 10 days ago

    Something like this is implemented in vscode-clangd. I used it for a bit but it's just too colourful. There are just colours everywhere and it's overwhelming. I went back to normal syntax highlighting.

  • KuhlMensch 10 days ago

    Curious. I mean it sounds like relying simply on contrast rather than the structure. I know our visual system is insane at contrast, and we, as humans tend to group tokens as a shorthand.

    What makes me immediately pause is when I reflect on reading JavaScript: How often do I scan past 3+ lines using colour as my "bridge"? As far as I can remember, not often. Maybe I've overestimated colour-to-lead-me-through-structure. Maybe it is often colour-to-give-me-token-rhythm. Curious.

    I'll have to remember to load up CSS or a test suite (with lots of framework calls) using this approach.

  • SilkySailor 10 days ago

    I really like this idea. I always wanted to try to take this to insane levels. For example, for large code bases have different images associated with different modules. So that your brain has more things to latch on to. e.g.: This function from the banana module is calling the teddy bear module. It seems a bit absurd since there is no correlation between the image and the module functionality but I still want to try it.

  • stochastimus 11 days ago

    This is really cool. It kinda looks like rainbow salad, but who cares? For me at least, it is much easier to visually parse.

  • DarmokJalad1701 11 days ago

    Nice to see some MASM32 code in there in one of the examples. That's from a WIN32 app if I am not wrong.

    Brings back memories.

  • FrancisNarwhal 10 days ago

    Oh my god this would have saved my bacon two days ago. p_value_default is so visually similar to v_value_default that after sitting there with another developer trying to figure out the problem for 30 mins we rewrote the whole method.

    Only the next day after the deadline pressure was gone did I spot the problem.

  • Avamander 10 days ago

    I understand it in the case of assembly, but I don't think it'd work for something like Python better than existing syntax highlighting. So it's nice and I hope things like Radare or IDA adopt it where people even intentionally make syntax highlighting nearly impossible.

  • ggm 10 days ago

    I encourage the original author to find a way to talk about assembly coding in the nuclear industry.

  • gcbw2 10 days ago

    what do you expect to be different from your run-of-the-mill maintenance of outdated industrial automation gig?

  • YeGoblynQueenne 10 days ago

    At a guess, an increased probability of causing a criticality accident as a result of getting a program slightly wrong.

  • exDM69 10 days ago

    I'm assuming the "reading assembly" part is verifying compiler output matches what the programmer thinks and signing it off as a "blessed binary".

    Some safety critical areas of software are done this way, in aerospace for example. But run-of-the-mill automation jobs aren't.

  • ggm 10 days ago

    bit flips from surplus neutrons? TMR? Batshit crazy lack of process checks on 'what does this button do'

    war stories.

    actually, I encourage anyone in coding to share run-of-the-mill maintenance of outdated industrial automation, as a gig. I'd read that blog.

  • pcwalton 10 days ago

    In this particular case, the highlighting is a clever workaround for the fact that x86 register naming conventions are awful. RISC architectures tend to number the registers, which makes things significantly easier to read.

  • m463 11 days ago

    Not code, but I'm surprised that email clients don't have better colorization from the get-go.

    I think it would be the single best thing to help a huge amount of people.

  • gnuvince 10 days ago

    There are too many colors in too many places. Everything is highlighted and nothing stands out.

  • galaxyLogic 10 days ago

    I agree. Rather than rainbow the brackets I think a better solution is to highlight the matching brackets with a temporarily different color as user moves the cursor.

    Or at least make it easy to turn the rainbows on and off.

  • Insanity 10 days ago

    which forces you to read everything individually and not miss something. I prefer less highlighting for this reason. I highlight a few keywords but other than that I don't highlight. I find it helps me _read_ the code rather than skim the code. (and for skimming, I'd grep through it most likely looking for something specific rather than trying to understand it.)

  • Analemma_ 11 days ago

    > In 2013 I was working in nuclear power plant automation ... the job required reading a lot of assembly code.

    Does anyone else find this terrifying? Nuclear power plant automation should be done in the safest of the safe languages. I would be alarmed at the thought of stuff like this being written in C, never mind in assembly!

  • holy_city 11 days ago

    Not really. There are plenty of chips out there without even a C compiler. Some don't even support Turing Completeness. There's even more that were designed and installed before manufacturers started slapping C compilers together for their DSPs, FPGAs, and MCUs.

    It would be weird to care about memory safety when your board doesn't even have a heap!

  • ARandomerDude 11 days ago

    To me, it's less terrifying than a complete rewrite in a modern language. Modern languages are great. Rewrites are often littered with bugs.

  • pvg 11 days ago

    Systems like that tend to be designed with different kinds of safeties. A mildly silly example - your typical Rails app doesn't have a watchdog timer, your toaster probably does.

  • okaleniuk 10 days ago

    An excellent example!

  • sixplusone 11 days ago

    Yes he said reading assembly, not writing. Whatever they use, I'm glad that someone's having a glance at what the compiler spits out. Also could be talking about microcontrollers, and in an industrial setting PLCs wouldn't be unexpected.

  • splittingTimes 11 days ago

    Does something like this exist for Java eclipse?