• pizlonator 9 months ago

    > As of 2012, all modern browsers ship a mark-and-sweep garbage-collector. All improvements made in the field of JavaScript garbage collection (generational/incremental/concurrent/parallel garbage collection) over the last years are implementation improvements of this algorithm (mark-and-sweep), but not improvements over the garbage collection algorithm itself, nor its goal of deciding whether an object is reachable or not.

    WebKit uses a constraint-based garbage collector that does not rely on reachability alone. This is an improvement over the classical garbage collection algorithm.

  • randomguy1254 9 months ago

    Nice article. Minor gripe about the static vs dynamic memory section. The requirement that data sizes be known at compile time for static memory (with the example of an array allocated to a user-supplied size) seems to be based on a past restriction of the C language. C has since removed the requirement that stack-based arrays be sized with a compile-time constant; there is nothing at the hardware/assembly level which prevents such arrays. So these stack-based non-compile-time-sized arrays don't fit into either the static or dynamic memory categories presented here.

  • phyllostachys 9 months ago

    I was bothered by saying that static memory is assigned on the stack (implied that it is only on the stack). As an embedded guy, at least in bare metal situations, local function statics, any const variables, and any globals are either in the read-only data section[1] or in the general data section[2], both before the heap and definitely not in the stack section.

    [1] - .rodata section in gcc

    [2] - I think this is in the .data and .bss sections, where .data is copied from the file and .bss is zero-initialized before calling main by the crt. If you peek into a linker script, or dump one by passing --verbose to ld, you can even see where it puts all the C++ bits and pieces. A dump I did on Debian with g++ is here: https://gist.github.com/Phyllostachys/0682a3bda13ef9c6b49d04...

  • phyllostachys 9 months ago

    After looking at that default linker script dump, I noticed it didn't have the stuff I saw in the ARM linker script that I'm used to. So I attached one for a Silicon Labs EFM32 part. It's below the x86-64 linker script and is a little easier to read.

  • phyllostachys 9 months ago

    After a little investigation[1][2], it seems things get weird with an OS, as would be expected with virtual memory, et al.

    [1] - https://stackoverflow.com/questions/16360620/find-out-whethe...

    [2] - http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in...

  • kahlonel 9 months ago

    Except for the fact that the arrays you are talking about are still put on the stack.

  • randomguy1254 9 months ago

    Apologies if unclear, that is what I meant by stack-based. The article implies that you cannot create a dynamically sized object on the stack.

  • vvanders 9 months ago

    Wait, is #3 for real? If that's the case it seems like a huge oversight.

  • IgorPartola 9 months ago

    Yeah, of the four things listed, I think it's the one that would trip me up the most. I think the safety net here would be dead code elimination: `unused()` should be detected as never called and removed during compilation/transpilation so that this wouldn't be an issue.

    I am sort of surprised that `unused` is not picked up by the garbage collector in the first place though. Since JS functions are objects, shouldn't it detect an object that's not referenced during the mark and sweep?

    In general, I really hate having to debug memory leaks in JS or Python. The interpreter for both will randomly allocate additional memory as it runs, so using tools like Valgrind is next to impossible. The only reliable method I've found is to pepper my code with logging statements that show what the current memory usage is, run the code like 1000-10,000 times, and see the points between which the memory usage goes up without coming down on a consistent basis. Python's built in `gc` module seems nearly useless for determining what's actually stuck in memory, and having a billion libraries that can have their own memory leaks is also not fun. These are the times I miss C: when you leak memory in C you know it because it becomes painful fast and it's usually easy to find, if your code is sane.
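
The logging approach described above can be sketched roughly as follows. This is a minimal sketch that assumes Node.js, where `process.memoryUsage()` is available; `sampleHeap`, `work`, and `retained` are hypothetical names chosen for illustration, and the "leak" is simulated by pushing into an array that is never cleared.

```javascript
// Assumption: running under Node.js. process.memoryUsage() is a Node API
// and is not available in browsers.

// Run `work` repeatedly and record heap usage every `every` iterations.
// `sampleHeap` is a hypothetical helper, not part of any library.
function sampleHeap(work, iterations, every) {
  const samples = [];
  for (let i = 1; i <= iterations; i++) {
    work();
    if (i % every === 0) {
      samples.push(process.memoryUsage().heapUsed); // bytes in use on the JS heap
    }
  }
  return samples;
}

// Simulated leak: every call retains another array forever.
const retained = [];
const samples = sampleHeap(
  () => retained.push(new Array(1000).fill(0)),
  1000,
  100
);

// Print the samples in MB; a trend that keeps rising across many runs
// (rather than rising and falling with GC) hints at retained memory.
console.log(samples.map((bytes) => (bytes / 1024 / 1024).toFixed(1) + " MB"));
```

Individual samples are noisy because the collector runs at its own pace, so the trend over many iterations is what matters, not any single reading.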

  • wahern 9 months ago

    The fix is to implement the closures properly, by only closing over individual variable slots. It looks like the engines are implementing closures by closing over entire windows of slots--that is, if two functions have the same scope, they inherit the _union_ of the variables they reference as a single window/block of variable slots.

    The original article has a much simpler explanation and solution: https://blog.meteor.com/an-interesting-kind-of-javascript-me...

    To learn more about closures than you ever thought possible, try reading this paper describing how closures are implemented in Lua: http://www.cs.tufts.edu/~nr/cs257/archive/roberto-ierusalims...

  • favorited 9 months ago

    I haven't done JS in a while, but it sounds to me like it is referenced, just not in source code. `someMethod` references it implicitly.

    Is my understanding correct?

  • gsnedders 9 months ago

    Yes. The function's frame contains a reference to the object storing local variables of the parent frame. To do better you need to store a list of variables referenced (which you can only do if there's neither direct-eval nor a with statement).
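
The pattern being discussed can be sketched as below. This is a reconstruction for illustration only: `unused` and `someMethod` are the names used in this thread, while `theThing`, `replaceThing`, `originalThing`, and `longStr` are assumed names. Whether memory is actually retained depends on the engine sharing one context object per scope, as described above.

```javascript
// Sketch of the shared-closure-scope pattern discussed in this thread.
// Only `unused` and `someMethod` are names from the thread; the rest
// are illustrative assumptions.
let theThing = null;

function replaceThing() {
  const originalThing = theThing; // the previous object

  // Never called, but it references `originalThing`, so that variable
  // must live in the closure context created for this scope.
  const unused = function () {
    if (originalThing) console.log("hi");
  };

  theThing = {
    longStr: "*".repeat(1000000),
    // Uses nothing from the outer scope. In an engine that gives all
    // closures created in the same scope one shared context, it still
    // points at the context holding `originalThing`.
    someMethod: function () {},
  };
}

// Each call can chain the new object to the previous one through the
// shared context, so older objects may stay reachable.
for (let i = 0; i < 3; i++) replaceThing();
```

Under the shared-context behavior described here, setting `originalThing` to `null` at the end of `replaceThing` would break the chain, since the context would no longer point at the previous object.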

  • Ajedi32 9 months ago

    > In general, I really hate having to debug memory leaks in JS or Python. The interpreter for both will randomly allocate additional memory as it runs, so using tools like Valgrind is next to impossible.

    I don't know about Python, but for JavaScript can't you use the built-in devtools? Maybe try grabbing a heap snapshot, or recording an allocation profile, and go from there?

  • paulddraper 9 months ago

    Yeah. I thought so too (and still think so).

    SO post: https://stackoverflow.com/questions/19798803/how-javascript-...

    Chrome bug report: http://crbug.com/315190

    Meteor blog (linked in article): https://blog.meteor.com/an-interesting-kind-of-javascript-me...

    Live example (will crash due to memory leak): https://s3.amazonaws.com/chromebugs/memory.html


    The reason this exists in all JS engines is performance; it's easier to have one context record instead of several.

    Other languages do not do this. Off the top of my head: Lua, Java, Scala

  • vvanders 9 months ago

    Yeah, there's a nice link in the comments on the chrome bug on how Lua does it with upvalues: https://bugs.chromium.org/p/chromium/issues/detail?id=315190...

  • dingo_bat 9 months ago

    > Live example (will crash due to memory leak): https://s3.amazonaws.com/chromebugs/memory.html

    Happy to report that no crash in Nightly.

    Edit: No crash in Edge either.

  • paulddraper 9 months ago

    It'll crash Chrome because it puts stricter limits on JS memory. (Or something.)

    Firefox and Edge won't crash, but you'll be using 3GB+.

  • dingo_bat 9 months ago

    Yup, Firefox was at 3.6GB and Edge at 2.8GB.

  • maxxxxx 9 months ago

    Would it be better to have to explicitly declare which variables you want to import into the closure, like PHP or C++ do? C# also captures everything by default and by reference, which has tripped me up quite a few times.

  • jhgb 9 months ago

    That might help but it seems like either an implementation or language spec fix is in order. There doesn't seem to be a reason for a function without free variables to turn into a closure at all, thus preventing the issue.

  • paulddraper 9 months ago


    Currently, the ECMAScript specification says nothing about GC.

    And it seems every major JS engine has decided that this type of memory leak is okay.

    So it's rather unlikely something will change.

  • bzbarsky 9 months ago

    It's for real, in V8. Other JS implementations may not have the same problem.

  • Jach 9 months ago

    The particular memory issue is new to me (though now I can watch out for it, yay) but I'm not surprised... JS lacking proper lexical scope causes many issues.

  • barrkel 9 months ago

    Lexical scope is a semantics concern - observable behaviour that is required for correctness. The case under discussion is an implementation concern - there's no necessity for it to leak by creating a linked list of activation records as further analysis could break the chain. The two are not related.

  • chasd00 9 months ago

    When you say "proper lexical scope", do you mean just block vs function level scoping of variables? If so, I wouldn't say JavaScript is wrong; it's just different.

  • fenwick67 9 months ago

    Who is this article written for? There's a whole section on "What is memory?". If you are optimizing to remove memory leaks I really hope you already know what memory is.

  • vijaybritto 9 months ago

    It was helpful for me. I don't have a CS background and I have been coding javascript for around 4 years now. There are a ton of other devs who don't have a proper knowledge of basics. If you are well versed then you can skip over to the next section.

  • mhh__ 9 months ago

    In my experience there are some programmers (I don't know how many) who, given how they were taught or learnt, can do productive work but generally don't know computing/programming in the abstract, e.g. memory at the machine level, or type systems.

  • userbinator 9 months ago

    > can do productive work but generally don't know computing/programming in the abstract e.g. Memory at the machine level,

    I think it's the other way around --- their usual level of abstraction is too high to understand such things...

    > or type systems

    ...and slightly too low to understand others.

  • skullum 9 months ago

    If you program, you have some concept of what memory is. I found the review of basic terms helpful in understanding the common JS leaks. YMMV

  • SadWebDeveloper 9 months ago

    I'm with you, pal. Memory leaks are the last topic you touch when optimizing a function: usually you start by lowering the execution time, followed by I/O blocking issues, then proceed to server-related issues and network latency, and only after all that is covered do you start looking for "memory leaks". So explaining memory is usually unnecessary at this point, since junior devs are usually more focused on producing software than on optimizing it.

    IMHO memory leaks are important only on embedded software because you start there with a really low memory available for your software to run.

  • lhnz 9 months ago

    It's also very important when building long-running applications (e.g. electron applications).

  • styfle 9 months ago

    Jump straight to "The four types of common JavaScript leaks" section:


  • dualogy 9 months ago
  • styfle 9 months ago

    Oops, it looks like medium removes the hash on load so my copy/paste didn't work. I fixed my link.

  • cel1ne 9 months ago

    The main problem with JS being sluggish (in electron apps for example) is memory and especially the GC.

    I can optimize CPU usage all I want, but only after I optimized for minimal allocations did the tiny but noticeable lags now and then disappear.

    The average javascript-GC must be really simple/naive compared to seasoned workhorses like the JVM's various GCs.

    There I can happily create millions of short-lived objects before getting problems in a single-user application.

  • chillacy 9 months ago

    Well, you can run JS on the JVM through Rhino (or its successor, Oracle's Nashorn), but apparently perf is still largely worse than SpiderMonkey or V8. The language doesn't lend itself to optimization as much as Java does for JVM bytecode: https://blogs.oracle.com/nashorn/nashorn-architecture-and-pe...

  • unkown-unknowns 9 months ago

    Probably part of the problem is also the fact that JavaScript is a very dynamic language.

    I think even the JVM team would struggle to improve on the state of the art in js vm tech. Their experience in making JVM might not be all that useful in the context of js.

  • irtefa 9 months ago

    Haven't read such an easy to read technical article in a while. Kudos!

  • dispo001 9 months ago

    Having stuff do stuff for you is useful until it doesn't.

  • haburka 9 months ago

    I'm really glad that I use a garbage collected language. Unless I'm doing low level programming that requires controlling allocation and freeing, it's amazing. Yes, I still have to understand the basics of memory but I'm just very glad that most of the time, the basics are far more than enough.

  • theprotocol 9 months ago

    I agree. I strongly dislike "magic" in programming. I prefer to call it "denial" because you need to know about the complexities anyway, and "magic" often means sweeping them all under the rug.

  • prophesi 9 months ago

    Exactly, you need to de-mystify the garbage collector so that it does what you want it to do.

  • abritinthebay 9 months ago

    At some point every abstraction above Assembly meets that criterion though. The lines are personal and mostly arbitrary.

  • taeric 9 months ago

    Meh, if you try and split hairs, even assembly has magic in it nowadays. Not all instructions take the same amount of time. Some flush caches and thus cause unexpected memory behavior, etc.

    However, I think it is fair that most people learn roughly what the side effects are of each line at a local level.

    Ironically, this is an argument against many functional languages. There are not side effects of the logic, per se. However, there are massive implementation side effects that are not necessarily easy to reason on.

    The saving grace for the vast majority of people is that typically you can get by without knowing all of this. The people that care, do care. But statistically you are not one of them. :)

  • abritinthebay 9 months ago

    Like I said - it's mostly arbitrary. ;)

    That said I think the issues with assembly you mention aren't magic as such, they're just consequences of the commands. They don't really hide much (if anything) behind the scenes that you'd have access to anyhow.

    It's just that CPUs do so much more than they used to.

  • horsawlarway 9 months ago

    It's not that simple though. Most modern CPUs that support the x86_64 instruction set don't actually run them as instructions on the hardware. They do all sorts of magic to queue operations, increase pipeline throughput, manage register access, make branch predictions, etc...

    You can think of assembly on those cpus as a high level language. It has little correlation with what's actually happening in hardware.

    This is EXACTLY the same type of "magic" that is getting complained about above. The real implementation details are hidden and unknown, but the abstraction is useful.

  • abritinthebay 9 months ago

    Ah, interesting. Not really familiar with x86_64. Mostly 16 & 32 bit experience here.

  • taeric 9 months ago

    I worded my post poorly. The "However, I think it's fair" was me agreeing with you. Pretty much completely.

    I was just musing on how the arbitrary line is probably not as difficult to see as many other lines we have out there. I think this would fall into "systems languages" and related things.

  • theprotocol 9 months ago

    True, in this context the side effects are not intentionally hidden under the pretext that "it works automagically."

  • theprotocol 9 months ago

    It's certainly subjective, but for me, what I pejoratively refer to as "magic" is things that there's just no escaping knowing, yet are abstracted in a way that obfuscates what's going on. Often, it's presented as "it just works" which ends up being a hindrance since there's just no getting around the thing that it's hiding for you.

  • def0wt 9 months ago

    > To prevent these mistakes from happening, add 'use strict'; at the beginning of your JavaScript files. This enables a stricter mode of parsing JavaScript that prevents accidental global variables.

    I don't think using strict mode will prevent all accidental global variables, such as `this.var` in global-scoped function calls. Strict mode's main goal is to prevent inadvertently misspelled variables from going unnoticed.

  • ufo 9 months ago

    Iirc, if you use strict mode, `this` is `undefined` by default in a plain function call instead of the global object.
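
A small sketch of the two strict-mode behaviors in question. The function names and `accidentalGlobal` are hypothetical, and the `"use strict"` directive is placed inside each function so the result does not depend on how the file itself is loaded.

```javascript
// Hypothetical helper: tries to create a global by assigning to an
// undeclared identifier inside a strict-mode function.
function assignUndeclared() {
  "use strict";
  try {
    accidentalGlobal = 1; // no var/let/const: an error in strict mode
    return "assigned";
  } catch (err) {
    return err.name;
  }
}

// Hypothetical helper: returns the value of `this` for a plain call
// to a strict-mode function.
function readThis() {
  "use strict";
  return this;
}

const assignResult = assignUndeclared(); // "ReferenceError"
const thisValue = readThis();            // undefined, not the global object

console.log(assignResult, thisValue);
```

Because `this` is `undefined` in the second case, a statement like `this.foo = 1` inside such a call throws a `TypeError` rather than silently creating a global.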

  • jnordwick 9 months ago

    More extremely junior posts being voted onto the front page. Y Combinator is changing and I don't like its new junior tutorial level.