Hacker News
Show HN: Pipet – CLI tool for scraping and extracting data online, with pipes (github.com/bjesus)
49 points by yoavm on Oct 2, 2024 | hide | past | favorite | 7 comments
I often find myself in situations where I need to extract some data from a website, either as a one-time thing or periodically (and then watch for changes). Maybe I'm tracking stocks, or a delivery, or I want to know when tickets become available for the local sauna; maybe a friend called and asked "can you get me all the data from that website into a spreadsheet?" (this happens surprisingly often)

I used to write one-off scripts for that, often in Python or JavaScript, but I noticed I'd spend about one minute getting the right CSS selectors, and then another 10 minutes setting up the rest of the script. I decided to write Pipet so that I'd never need to write any of that boilerplate code again. By now I've probably spent more time writing Pipet than on all the scrapers I've ever written, but hey, it's been fun…

Pipet has two modes: curl and Playwright. With curl, you can either type "curl http://news.ycombinator.com" or just copy-paste the request you want to duplicate from the browser. Pipet runs curl just like your shell does, so all headers and preferences work the same. I found this super useful when trying to emulate a real browser or access something behind a login. With Playwright mode, you just use JavaScript. If it works in the devtools console, it should work with Pipet too.

Some other features I think are useful: you can output the results as text or JSON, or write a template file for the results to be rendered into. You can also run Pipet on an interval, and run a command when the data changes. Lastly, Pipet fully integrates with UNIX pipes, so you can do stuff like `div#main h1 | wc -c` and it will take the h1 from a div with id "main" and pipe its HTML to `wc -c`. It makes it extremely easy to use tools you already know to process the data before Pipet outputs it. It also works when processing JSON - you can call jq, or whatever tool you like, to help with the processing.
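To make the pipe syntax concrete, here's a hypothetical pipet file using the `div#main h1 | wc -c` selector from above (the URL is made up for the example, and the exact file layout is a sketch based on the description here - check the README for the real format):

  curl https://example.com
  div#main h1 | wc -c

Saved as something like count.pipet, running Pipet on it would extract the h1 inside the div with id "main" and pipe its HTML through `wc -c`, so the output is a character count rather than raw HTML.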

I've noticed I've written so many more little scrapers since I've had Pipet around, because it's become such an easy task - so hopefully others will find it useful too!



This is great. Thanks for sharing. I'm sitting on a cache of one-off scripts as well. Looking forward to checking this out further.


Just as an example: if you have Go installed on your laptop, you can extract all the comments from this post as JSON by creating a file with

  curl https://news.ycombinator.com/item?id=41695549
  .comment
    div > div
saving it as comments.pipet, and running `go run github.com/bjesus/pipet/cmd/pipet@latest --json comments.pipet`. Or run it with `--interval 60 --on-change "notify-send {} "` to periodically check for updates and call notify-send (on Linux) to get a notification when a new comment appears!


Well done!


That's all nonsense. If it's not imitating a real user by launching a real browser on a GPU and moving the mouse cursor with AI like a real human would, you might as well throw this project in the trash. Also, the code doesn't matter unless it's run from a network identified as a regular residential apartment.


The idea is that you run this from your personal computer, so I don't really see any issues with the network? Regarding the AI browser emulation on a GPU - do you have an example of such a website that you'd like to scrape? I'd love to see how Pipet can support that.


Said by "cynicalsecurity" :D


Because every website has such an advanced protection?




