OzBargain RSS Feed - Accessing Deal Comments?

Hi Team,

I wanted to use the OzBargain RSS feed to access a large number of comments on OzBargain (using Python) to perform some sentiment analysis on them for fun.

It seems that there was previously an OzBargain.com.au/rss link that could be used to achieve this easily, but that link no longer works.

The community seems to use https://www.ozbargain.com.au/deals/feed, but I'm not sure this is enough to access comment data.

Could someone please advise on this?
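
For reference, the deals feed above is an ordinary RSS feed, so it can be read with any RSS parser. A minimal Python sketch, assuming the third-party feedparser package is installed:

```python
import feedparser

# Parse the public deals feed; entries follow standard RSS field names.
feed = feedparser.parse("https://www.ozbargain.com.au/deals/feed")

for entry in feed.entries:
    # Each entry is one posted deal (newest first, per the reply below).
    print(entry.title)
    print(entry.link)
    # entry.summary holds the deal description; the feed does not include comments.
```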

Comments

  • We've never had a "comment feed". The deal feed you linked to gives you all the posted deals in reverse chronological order.

    • Thanks for the response. Would you know the best way I could access comment data, similar to how the data is gathered for the transparency reports?

      • The monthly transparency reports were generated from moderator tools & direct database access.

        One possible way to access all the comments would be to use the Live feed and then pick out the comment entries on your side. It does not give you the content of each comment, just a URL, which you might be able to retrieve by scraping the site (a rough Python sketch of that approach is included after this thread). Do note that excessive scraping will get your bot banned.

      • I could write a scraper and make an API for you if you'd like. I have done this before with another forum.

        • That is an amazing offer! Just for my own learning, though, I would like to try to do that myself first.

          Would you be able to direct me to any good resources on how to build an OzB API?

          Thanks

          • @WilhelmBargo: I don't know Python, but I can tell you what I used for JavaScript. I wrote a little scraper in Node using Cheerio that I could run just from the command line. It takes a bit of work inspecting every element to work out how the page is laid out and how to get the information in the order you want it. I just printed the info to the console to make sure I could access it.

            I wrote an API using Express.js with MongoDB for the database, hosted on Heroku, which is free. It had calls for adding elements to the database. I integrated that into the scraper so it would add new info or edit what already existed as it scraped.

            Then (and my memory is hazy) I bundled my scraper into a library to add to my server, so I could run it on the web, with the variables (e.g. a specific page or person) supplied as GET variables in the URL. I think I tried to find a way to have it scrape automatically a few times a day, but I failed at that and gave up, so I had to load a URL each time I wanted it to launch a scrape.

            Then I built a UI in Angular and had it access the API for its data. I spent the most time on that (making it easy to use for the friend who was using it with me) and on tinkering with the API, but it was fairly straightforward. You probably won't need a UI, just a way to store and read the data. You could print it as plain text to the browser if you bundle it up for the server, or just run something from the command line and print to the console or to a CSV file. (A small Python stand-in for the store-and-read part is sketched after this thread.)
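
A rough Python sketch of the Live-feed approach suggested above. The Live feed URL, the assumption that comment entries carry a "#comment" fragment in their link, and the idea that a comment's HTML node can be looked up by that id are all guesses to be checked against the actual feed and page markup; it assumes the feedparser, requests and beautifulsoup4 packages are installed.

```python
import time

import feedparser
import requests
from bs4 import BeautifulSoup

# Assumed location of the Live feed; confirm the real URL on the site.
LIVE_FEED_URL = "https://www.ozbargain.com.au/live/feed"

live = feedparser.parse(LIVE_FEED_URL)

# Keep only entries that point at a comment anchor (assumption: comment links
# end in a fragment like "#comment-1234567").
comment_links = [e.link for e in live.entries if "#comment" in e.link]

for link in comment_links:
    page = requests.get(link, headers={"User-Agent": "ozb-sentiment-hobby-bot"})
    soup = BeautifulSoup(page.text, "html.parser")

    # Hypothetical lookup: find the comment node by the id in the URL fragment.
    comment_id = link.split("#")[-1]
    node = soup.find(id=comment_id)
    if node is not None:
        print(node.get_text(" ", strip=True))

    # Be polite between requests; excessive scraping will get your bot banned.
    time.sleep(5)
```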
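
The stack in the last comment is JavaScript (Cheerio, Express.js, MongoDB, Angular). As a rough Python stand-in for just the "store what you scrape, then read it back" part, here is a sketch using the standard-library sqlite3 module in place of MongoDB; the table name and columns are made up for illustration, and the scraping itself would look like the BeautifulSoup sketch above.

```python
import sqlite3

# Small local store standing in for the MongoDB database described above.
conn = sqlite3.connect("ozb_comments.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS comments (
           url  TEXT PRIMARY KEY,  -- link to the comment
           body TEXT               -- scraped comment text
       )"""
)

def save_comment(url: str, body: str) -> None:
    """Insert a scraped comment, or overwrite it if that URL was already stored."""
    conn.execute("INSERT OR REPLACE INTO comments (url, body) VALUES (?, ?)", (url, body))
    conn.commit()

def all_comments():
    """Read everything back, e.g. to feed into the sentiment-analysis step."""
    return conn.execute("SELECT url, body FROM comments").fetchall()
```

From there, dumping the rows to a CSV file or printing them to the console works exactly as the comment suggests, with no UI needed.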
