Hi, I requested my data from Reddit and got 16 MB of CSVs, which is a considerable amount. Does anyone know of a tool to process, visualize, or search the data? I assume the format is the same for everyone, so maybe someone has already built something like that.

EDIT: The problem is not performance; with files under 5 MB I can search with Notepad++ in milliseconds. What I’m looking for is a user-friendly interface (ideally with thumbnail images, links and such).

The problem with searching for “reddit export data visualizer” is that Google shows Reddit posts about visualizing generic data.

Thanks.

  • slazer2au@lemmy.world
    10 months ago

    Firstly, can you even open the CSVs? If you can, then Power BI Desktop by Microsoft is the emerging go-to for data visualisation.

    • CrulOP
      10 months ago

      Yes, no problem reading the CSVs, sorry if that wasn’t clear.

      I was looking for something more specific. Ideally something like a local web app that renders the posts, comments, etc. in a webpage with thumbnails and links to the Reddit elements.

      But that’s probably asking too much :).
      Thanks for the suggestion!
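
A minimal sketch of that idea, assuming the export contains a CSV with a text column and a link column. The column names `body` and `permalink` and the sample row are assumptions made for illustration, not a confirmed description of the Reddit export format. It only turns rows into an HTML list of links; thumbnails are left out.

```python
import csv
import html
import io


def render_rows(rows, text_field="body", link_field="permalink"):
    """Return an HTML list with one item per CSV row.

    text_field and link_field are hypothetical column names; adjust them
    to whatever headers the exported files actually use.
    """
    items = []
    for row in rows:
        # Escape everything so text from the CSV cannot break the markup.
        text = html.escape(row.get(text_field) or "")
        link = html.escape(row.get(link_field) or "", quote=True)
        if link:
            items.append(f'<li><a href="{link}">{text}</a></li>')
        else:
            items.append(f"<li>{text}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def render_csv(csv_text):
    """Parse CSV text with a header row and render it as an HTML list."""
    return render_rows(csv.DictReader(io.StringIO(csv_text)))


# Made-up sample data standing in for one exported file.
sample = (
    "permalink,body\n"
    "https://www.reddit.com/r/test/comments/abc/,Hello <world>\n"
)
page = render_csv(sample)
```

Writing `page` to a local `.html` file and opening it in a browser would give a basic clickable view of the rows.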

      • 𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one
        10 months ago

        If you find one, let me know pretty please…

        I found a UI for my Hangouts data a while back, and I occasionally skim through those old chats. It’s nice to have a tool that visualises data request files in a user-friendly way.

        • CrulOP
          10 months ago

          I’m searching GitHub for the different CSV filenames and I found a couple of projects that may be relevant:

          EDIT: This one also looks interesting:

          I’m still trying to figure out how to use them.

      • CrulOP
        10 months ago

        The links you posted are weird:

        • https://pixeldrain.com/u/KfgV7bqn: It offers to download a file named Antimutt in r-Excel ultra.paq8o, and I have no idea what it is for.

        • https://the-eye.eu/redarcs: It says “This Reddit Community Has Been Archived”

        • Antimutt@lemmy.world
          10 months ago

          The first is the result of extracting all lines with my nick in them from the CSV, stored with the best compression around. The second is where to get the CSV - and a lot of communities have been archived there, like it says.
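
The extraction step described above could be sketched roughly as follows. This is an illustration under assumptions, not the method actually used: the function name `rows_mentioning`, the sample columns and the sample rows are all made up, and it matches the nick case-insensitively in any cell.

```python
import csv
import io


def rows_mentioning(csv_text, needle):
    """Return (header, rows) for the rows where any cell contains needle."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)  # first row is assumed to be the header
    needle = needle.lower()
    matches = [
        row for row in reader
        if any(needle in cell.lower() for cell in row)
    ]
    return header, matches


# Made-up sample standing in for an archived community CSV.
sample = (
    "author,body\n"
    "Antimutt,first\n"
    "someone,reply to antimutt\n"
    "other,unrelated\n"
)
header, matches = rows_mentioning(sample, "Antimutt")
```

For a real file, the text would come from `open(path, newline="", encoding="utf-8").read()`, or the reader could be wrapped around the open file directly.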

          • CrulOP
            10 months ago

            Just to confirm I understand: you are talking about Power Query vs. Power BI for dealing with huge datasets, right?

            Because, in my case, with 16 MB, I don’t see the need for anything especially powerful. My problem is not performance, but convenience.

            Thanks for the input.