• charlieegan3 5 months ago

    This is a great idea for a project, the 'users who also posted' metric seems to have worked really well.

    The site seems to fail to load the 'hot' items for the subreddits when I click on them but that's not a big deal for me. On closer inspection, it doesn't seem to be making any requests. Just says `Failed to download https://www.reddit.com/r/thinkpad/hot.json` etc

  • aasasd 5 months ago

    > the 'users who also posted' metric

    — Hello, is this the anime channel?

    — Yes.

    — How do I patch KDE2 under FreeBSD?


  • Smithalicious 5 months ago

    The accuracy of this meme is stunning. I run an anime-related discord of 20-odd people and at least half of the people there work in tech in some way. I've seen similar things in other such communities.

    I wonder if this is just a cultural artifact from the time that anime and technology were both "geeky" niche interests (to a greater extent than they are now) or if there's a deeper underlying reason...

  • vidarh 5 months ago

    It may be a stereotype, but to me it seems that in geek circles it is much more acceptable to admit to continuing to appreciate things often seen as "childish" elsewhere in general.

  • mrfusion 5 months ago

    Another metric could be looking at cross posts. I’m not sure which is better.

  • Breza 5 months ago

    That could be cool, but it would eliminate any subs that don't allow crossposting. That includes a few of the heavy hitters like ShowerThoughts and AskReddit.

  • anvaka 5 months ago

    hmmm... I don't see the error on my end. What browser do you use? Can you try in "incognito" mode? Are there any extensions that might be blocking this?

  • Sendotsh 5 months ago

    Doesn't work for me either, on Firefox Developer Edition 65.0b10 (64-bit) with no extensions enabled (disabled them all to double-check it wasn't one of them blocking it).

    Works fine in Edge.

    It's purely the loading of the Hot sidebar, everything else works fine. It has already helped me find a few new subs I didn't know existed, so thanks!

  • newman314 5 months ago

    Have you thought about building something similar for bot identification on Twitter? I suspect that would be quite the useful feature.

  • charlieegan3 5 months ago

    Yeah I'm on FF - can confirm my issue was related to the content blocking.

  • jsloss 5 months ago

    I'm getting the same error on Firefox Quantum.

  • anvaka 5 months ago

    Hm... I'm at a loss.

    https://jsbin.com/fuyijan/2/edit?js,console - this works in Chrome, and in the non-private mode of Firefox Quantum (64.0.2 (64-bit)). However, when I open private browsing in Firefox Quantum, the request fails.

    Does anyone know why?

  • EvilTerran 5 months ago

    That sounds like Content Blocking kicking in - that's only active in Private Browsing by default: https://support.mozilla.org/en-US/kb/content-blocking

    I note that page says "By default, content blocking uses the Disconnect.me basic protection list" - and reddit.com is on that list: https://github.com/disconnectme/disconnect-tracking-protecti...

    (I'm guessing reddit's "social button" is considered a tracker.)

    [edit] confirmed, it's definitely Content Blocking: I just loaded that jsbin in an FF private window, and there's a message in the console to that effect.

  • anvaka 5 months ago

    Thank you so much! I opened an issue here: https://github.com/disconnectme/disconnect-tracking-protecti...

  • renholder 5 months ago

    >(I'm guessing reddit's "social button" is considered a tracker.)

    It wouldn't surprise me. Even though /r/ has concepts like "Silver" and "Gold" to generate revenue, I think its main driver is still advertising; so, for it to behave like Facebook, Google, etc. wouldn't be that much of a stretch of the imagination. (Or maybe I'm just far too paranoid?)

  • Smithalicious 5 months ago

    It's a cool tool, but it seems very biased towards bigger subs. If you let it loose on a small sub it will emphasize big, kinda-but-not-really related subs over tiny-but-closely-related subs.

  • ppod 5 months ago

    Using Jaccard has this effect; mutual information would correct more for the independent frequency of the posts per subreddit.
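
    A rough sketch of the difference, assuming each subreddit is reduced to the set of users who posted there. The user ids, sub sizes, and the total user count are made up for illustration, and pointwise mutual information (PMI) stands in for "mutual information" here; none of this is taken from the site's actual code.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Shared users divided by all users of either sub."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pmi(a: set, b: set, total_users: int) -> float:
    """Pointwise mutual information: log2 of the observed overlap relative to
    the overlap expected if posting in the two subs were independent."""
    both = len(a & b)
    if both == 0:
        return float("-inf")
    return math.log2(both * total_users / (len(a) * len(b)))


TOTAL_USERS = 10_000                                # hypothetical site-wide user count
small = set(range(100))                             # the sub being queried: 100 users
big = set(range(60)) | set(range(1000, 1940))       # 1000 users, 60 shared with `small`
tiny = set(range(95, 100))                          # 5 users, all of them also in `small`

print("jaccard small/big :", round(jaccard(small, big), 4))
print("jaccard small/tiny:", round(jaccard(small, tiny), 4))
print("pmi small/big     :", round(pmi(small, big, TOTAL_USERS), 3))
print("pmi small/tiny    :", round(pmi(small, tiny, TOTAL_USERS), 3))
```

    With these toy numbers Jaccard ranks the big sub slightly above the tiny one, while PMI ranks the tiny one well above the big one, because it divides by how many users each sub has on its own.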

  • Smithalicious 5 months ago

    It's a shame since this tool would be particularly useful for recommending small subs. I don't need it to tell me about big subs, since I already know them.

  • patcon 5 months ago

    This is seriously amazing man! Interesting to see how different subject-areas network themselves differently.

    For example, comparing "r/permaculture" to "r/linux".

    Also, looking at r/girlgamers makes me realize my privilege for being able to navigate my interest areas without such a clusterfuck of bullshit going on: https://anvaka.github.io/sayit/?query=girlgamers

  • swampthinker 5 months ago

    It's really sad how toxic Reddit brigading is

  • skilled 5 months ago

    This is awesome! My input had exactly the results I expected.

    Thanks for creating this tool, bookmarking!

  • anvaka 5 months ago

    Thank you! I'm very glad you liked it :)

  • viraptor 5 months ago

    I checked VXjunkies and found a level of weirdness I hadn't expected. Will need a few hours to browse through this while nobody is around / can be startled by sudden, random laughter...

  • nairboon 5 months ago

    That's a cool tool. A useful extension would be to preserve the location history as you navigate topics, so that you can go back.

  • anvaka 5 months ago

    Good call. I was worried that I'd "spam" the browser history and people who are coming from reddit or HN would never go back to where they came from :)

  • adrianmonk 5 months ago

    Usability improvement idea: make it easier to discover how to re-center the graph around a new subreddit.

    I spent several minutes playing around with this, and I was just typing in the name of the desired subreddit because that was the only way I could figure out. Finally, after much experimenting, I realized double-clicking is the solution.

    Oh, and a second, related usability idea: if I double-click, don't open the preview sidebar at the right. I can see how the sidebar is useful, but if I'm doing one action, I don't want it to have two effects. Also, I have signaled clear intent to browse the graph, so I want more screen real estate to be devoted to that.

    EDIT: bonus usability idea/request: clicking on a node brings up the preview sidebar. It'd be nice if clicking on it again (not double-clicking) makes the sidebar hide again.

  • KasianFranks 5 months ago

    Anvaka, when you accept BTC or ETH let us know, we can contribute to your efforts.

  • anvaka 5 months ago

    Thank you, Kasian!

  • hueyjj 5 months ago

    > The relationship is determined by a metric "users who posted to this subreddit also post to...".

    I'm interested, could you share with us the entire metric you used to determine the relationship?

  • anvaka 5 months ago
  • jcims 5 months ago

    Have you tried polling profiles to see how many are sharing upvotes/downvotes? It used to be a small percentage but is pretty informative.

  • minimaxir 5 months ago

    You indicated that you used the Pushshift.io datasets, but how did you compute Jaccard Similarity on a dataset of 38M?

  • anvaka 5 months ago

    I didn't use pushshift, sorry. The data was collected from bigquery, stored locally into CSV files, and then I just wrote a node.js script to compute similarities.
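
    For anyone curious what that kind of script might look like, here is a minimal Python sketch (not the actual node.js code). It assumes the CSV has one `user,subreddit` row per pair; those column names and the sample rows are hypothetical. Counting co-occurrences per user means only pairs of subs that share at least one user are ever touched.

```python
import csv
import io
from collections import Counter, defaultdict
from itertools import combinations


def pairwise_jaccard(rows):
    """rows: iterable of (user, subreddit) pairs.
    Returns {(sub_a, sub_b): jaccard} for every pair sharing at least one user."""
    subs_by_user = defaultdict(set)
    for user, sub in rows:
        subs_by_user[user].add(sub)

    users_per_sub = Counter()   # |A| for each subreddit
    shared = Counter()          # |A & B| for each co-occurring pair
    for subs in subs_by_user.values():
        users_per_sub.update(subs)
        shared.update(combinations(sorted(subs), 2))

    # |A | B| = |A| + |B| - |A & B|
    return {
        (a, b): n / (users_per_sub[a] + users_per_sub[b] - n)
        for (a, b), n in shared.items()
    }


# Made-up sample standing in for the locally stored CSV files.
SAMPLE_CSV = """user,subreddit
alice,thinkpad
alice,linux
bob,thinkpad
bob,linux
carol,linux
carol,anime
dave,anime
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
scores = pairwise_jaccard((r["user"], r["subreddit"]) for r in reader)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

    The inner loop is quadratic in the number of subs a single user posts to, so very prolific accounts would probably need capping or special handling at real scale.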

  • Scaevolus 5 months ago

    Did you simply collect "user has posted to X, Y, and Z subreddits", or did you look at frequency too?

  • minimaxir 5 months ago

    The reason I asked the question is because back in 2016 I had a similar (now out of date) approach to finding related subreddits at scale using Jaccard similarity: https://minimaxir.com/2016/06/reddit-related-subreddits/

    There, I only built a user edge if a given user commented on 5 distinct threads in a subreddit, since a lot of subreddit interaction was due to brigading.
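
    As a sketch, that filter could look something like the following. The `(user, subreddit, thread_id)` tuple layout and the function name are assumptions made for illustration, not the code from the linked post.

```python
from collections import defaultdict


def user_edges(comments, min_threads=5):
    """comments: iterable of (user, subreddit, thread_id) tuples.
    Keeps a (user, subreddit) edge only when the user commented in at least
    `min_threads` distinct threads of that subreddit."""
    threads = defaultdict(set)
    for user, sub, thread_id in comments:
        threads[(user, sub)].add(thread_id)
    return {pair for pair, seen in threads.items() if len(seen) >= min_threads}


# Made-up data: one regular, and one drive-by commenter who posts three
# times in a single thread (the kind of activity a brigade produces).
comments = [("regular", "linux", f"t{i}") for i in range(5)]
comments += [("drive_by", "linux", "t0")] * 3

edges = user_edges(comments)
print(edges)
```

    Counting distinct threads rather than raw comments is what keeps a burst of replies in one brigaded thread from creating an edge.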

  • anvaka 5 months ago

    I didn't look into frequency. Is there a version of Jaccard similarity that accounts for frequencies?
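
    One candidate is weighted Jaccard (also known as Ruzicka similarity): instead of intersecting sets, sum the per-user minimum of the post counts and divide by the sum of the per-user maximum. A minimal sketch, with made-up post counts:

```python
from collections import Counter


def weighted_jaccard(a: Counter, b: Counter) -> float:
    """sum(min(a_u, b_u)) / sum(max(a_u, b_u)) over all users u.
    a and b map user -> number of posts in that subreddit."""
    users = a.keys() | b.keys()
    num = sum(min(a[u], b[u]) for u in users)   # Counter gives 0 for missing users
    den = sum(max(a[u], b[u]) for u in users)
    return num / den if den else 0.0


linux = Counter({"alice": 10, "bob": 1})
thinkpad = Counter({"alice": 4, "carol": 2})
print(weighted_jaccard(linux, thinkpad))   # 4 / 13

# When every count is 1 this reduces to plain set-based Jaccard.
print(weighted_jaccard(Counter({"a": 1, "b": 1}), Counter({"b": 1, "c": 1})))   # 1 / 3
```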

  • scrollbar 5 months ago

    Check out Graphlab Create's recommender toolkit, pretty fast for sets of that size


  • Smithalicious 5 months ago

    +1 for this recommendation, but it's called turicreate now and can be found here: https://github.com/apple/turicreate

  • jotato 5 months ago

    *types in DunderMifflin

    related: MapsWithoutSouthSudan

    I know what I am going to be doing for the next 30 minutes

  • bibyte 5 months ago

    This is a really useful tool. It works so smoothly on my mobile.

  • anvaka 5 months ago

    Happy to hear :)!

  • Phenomenit 5 months ago


    I've been searching for a tool like this for ages, bookmarked!

  • anvaka 5 months ago

    Thanks :)

  • laurynas-s 5 months ago

    This is really nice!

  • anvaka 5 months ago

    Thanks :)!

  • techaddict009 5 months ago

    Good tool. If possible, add an option to view the result data in tabular format with the number of subscribers, as this way it's difficult to use.

  • DevX101 5 months ago

    Great tool! This site supports my suspicions that much of the activity on /r/The_Donald is the coordinated effort of a few individuals posting across multiple accounts. For those not familiar with this sub, it was created sometime during the 2016 election leadup and unabashedly supports Donald Trump with memes and shitposting. At one point, the entire frontpage of reddit was just posts from /r/The_Donald until reddit admins had to alter their algorithm to force the sub off.

    If you look at the network graph for /r/The_Donald, it doesn't look...organic. There are 4 clearly delineated clusters of subs related to that sub. Posters to /r/The_Donald heavily post to /r/news & /r/politics, /r/TropicalWeather (?), /r/TwoXChromosomes (?) and /r/AskTheDonald (and other alt-right subs).

    There's not much interaction with the rest of reddit. Posters from other subs don't also post content to /r/The_Donald.

    This is unusual.

    In every other sub I've looked at, there's a much more complex & dynamic graph where users post across various communities across the site. Every other major sub looks like a real network with dozens of interconnected links. Yet /r/The_Donald, with almost 700,000 subscribers, only has a strong connection to 4 clusters.

    The alternate hypothesis is that people on that sub heavily use alternate accounts. This might also explain the lack of interaction with the site compared to other subs of similar size.

  • zawerf 5 months ago
  • DevX101 5 months ago

    Thanks! That's probably it then. I guess this doesn't support my hypothesis after all.

  • bdibs 5 months ago

    This is great, and works flawlessly!

  • anvaka 5 months ago

    Thank you! I'm so happy you like it.

  • bdibs 5 months ago

    It’s simple and just works, don’t stop making great things.

  • anvaka 5 months ago

    Aww, thank you!

    > don’t stop making great things.

    Not going to ever stop! I have sooo many ideas - I wish I could be more efficient :).

  • cannedslime 5 months ago

    Useful little tool! Reddit humor subs are so damn specific, it can be hard to find them all.

  • mrfusion 5 months ago

    Is this only for tech subjects or am I using it wrong?

    Edit. Somehow I missed the big searchbar at the top.

  • cambaceres 5 months ago

    I tried "tits", that worked.

  • criddell 5 months ago


  • cambaceres 5 months ago

    There were some cocks present anyway

  • jamiek88 5 months ago

    Fantastic! I tested the heck out of this and found it really useful.

    Already found some cool subs.

  • belltaco 5 months ago

    You should submit this to r/dataisbeautiful if not already done.

  • ppod 5 months ago

    Which javascript network vis library does this use? It's very nice.

  • yanslookup 5 months ago

    I was sort of expecting to be able to click through to the subreddit...

  • benibraz 5 months ago

    Very nice tool, thank you very much for that. This is why I love HN

  • kerbalspacepro 5 months ago

    Interesting finds:

    * /r/askscience is nested at the center of defaults (I think a lot of older, famous subs will end up highly connected)

    * /r/relationship_advice is kind of a loner. The graph generates six distinct subreddit clusters: feminism, lgbt-issues, counseling, and misc. science fields. The last cluster is a very large, diffuse cluster of sex/porn/depression subreddits that skew towards defaults.

    * /r/slatestarcodex has distinct clusters too. 1) Effective altruism and philosophy, 2) Psychiatry, 3) Rational fiction writing, 4) Liberal-tarian, IDW defaults, 5) "Classic effort post" subs like true_reddit and depth_hub.

    * /r/bigboye is a tiny part of a very large network of animal gifs subreddits. /r/animalsbeingbros connects it to a bunch of high volume gif subs.

  • newman314 5 months ago

    * /r/the_donald has a surprising link to /r/TwoXChromosomes [https://anvaka.github.io/sayit/?query=the_donald]

    * /r/politics seems to have higher interconnection

    * /r/awww is quite wholesome =) [https://anvaka.github.io/sayit/?query=Awww]

    * /r/puppers has some strange nsfw links

  • belltaco 5 months ago

    >/r/the_donald has a surprising link to /r/TwoXChromosomes

    I don't think it's surprising. Donald fans on social media tend to hate minorities and women, not surprised they would try to brigade women oriented subs.

    It got so bad that subs like /r/offmychest automatically ban people that post in many alt right related subreddits.

  • patcon 5 months ago

    The isolatedness of /r/relationship_advice might have to do with OPs posting from throwaways?

  • chad_strategic 5 months ago

    This is great!

    But on a side note, I can also waste more time on the Internets!

  • diziet 5 months ago

    Is this built on top of your work on yasiv before?

  • anvaka 5 months ago

    It would be fair to say so. The core layout is the same with a bit more polished overlap removal and animation.

  • myself248 5 months ago

    Why do I get stuck in "dead ends"? For instance, https://anvaka.github.io/sayit/?query=rtlsdr contains https://anvaka.github.io/sayit/?query=PlutoSDR but the inverse is not true -- once I'm in PlutoSDR there's only one other subreddit and the two of them are an island.

  • andyidsinga 5 months ago

    damn - I'm wondering if this could help with marketing, to find out where your audience hangs out.

  • sureaboutthis 5 months ago

    Ya' know this assumes one would use reddit as a reference for learning which one should never, EVER do, don't ya?

  • thro_a_way 5 months ago

    hi thanks for this. Is there a guide to how you are storing the data on github pages?

  • amunategui 5 months ago

    Great visualization! Nice work.

  • chx 5 months ago

    Incredibly useful, thanks!

  • diimdeep 5 months ago


  • flylib 5 months ago

    nice tool

  • yzb 5 months ago

    Would be nice if banned subs appeared in a different colour.

  • patcon 5 months ago

    If you have spacetime, you might consider sharing this with LGBTQ and kink communities experiencing the Tumblr diaspora.

    Context: https://nowtoronto.com/lifestyle/advice/savage-love-tumblr-p...

    Lots of people feel uprooted from sex-positive and/or tightly-bound communities they've been part of for years, and don't know how to rediscover or rebuild the healthy networks they've lost on Tumblr. I know full-grown adult women who are struggling to find footing again in the most personal of spaces.