11-26-2024 01:10 AM
Hello everyone!
I would like to know if there is any tool or API to see how many discussions have been opened year by year on forums.ni.com with the label "LabVIEW". I am interested in historical data going back to 1998, if possible. I am looking for something like SEDE for StackOverflow (Stack Exchange Data Explorer).
Do you have any ideas?
11-26-2024 01:37 AM
Hi dvrubins,
@dvrubins wrote:
I would like to know if there is any tool or API to see how many discussions have been opened year by year on forums.ni.com with the label "LabVIEW". I am interested in historical data going back to 1998, if possible.
Do you have any ideas?
See how far you get!
11-26-2024 01:47 AM
Some thoughts:
11-26-2024 02:22 AM
11-26-2024 03:17 AM
I have a few reservations about this method. One is that it ends up being a scraper. I am afraid that this may violate the terms of service of the NI forum, although I haven't found anything on the Service Terms - NI page that says so. Other forums consider this type of scraping to be an invasion of privacy or piracy. My intention is only to get the total number of discussions at the end of each year.
With your method I would scrape only the HTML tags with the 'local-date' class, but as I said, I'm not sure whether that violates anything ^^'.
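For illustration, here is a minimal sketch of what counting by the 'local-date' class could look like, using only the Python standard library. The markup shape (a tag whose class list contains `local-date` and whose text is an MM-DD-YYYY date) is an assumption based on how dates are displayed in this thread, and the names `LocalDateParser` and `count_threads_by_year` are made up for the example. Which date the element actually carries (thread start or latest activity) would need to be checked on the real pages.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class LocalDateParser(HTMLParser):
    """Collect the text of every element whose class list contains 'local-date'."""

    def __init__(self):
        super().__init__()
        self._capturing = False  # True while inside a 'local-date' element
        self.dates = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "local-date" in classes:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing and data.strip():
            self.dates.append(data.strip())

    def handle_endtag(self, tag):
        # Assumes 'local-date' elements contain no nested tags.
        self._capturing = False


# Assumed date format: MM-DD-YYYY, as shown in the forum timestamps.
YEAR_RE = re.compile(r"\b\d{2}-\d{2}-(\d{4})\b")


def count_threads_by_year(html_pages):
    """Return a Counter mapping year -> number of 'local-date' entries found."""
    counts = Counter()
    for html in html_pages:
        parser = LocalDateParser()
        parser.feed(html)
        for text in parser.dates:
            match = YEAR_RE.search(text)
            if match:
                counts[int(match.group(1))] += 1
    return counts
```

Feeding it the saved HTML of each listing page would give one count per year. Only the date strings are kept, no user names or post content.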
11-26-2024 03:28 AM
I tried to go page by page, but it didn't work. The suggestion to jump through the pages is not amusing because the discussions, although packed by year, are sorted by """relevance""".
I will look into the Khoros API to automate the process. Do you have a link to the API for Khoros on NI forum?
Thanks for the link to JB's post. I'll try to ask him about his methodology if I can't solve it via Khoros.
11-26-2024 03:35 AM
Hi dvrubins,
@dvrubins wrote:
One is that it ends up being a scraper.
Yes.
IMHO it doesn't matter if I load the website by clicking a button/link in the browser or by loading the same HTML code using an HTTPRead function.
As long as you don't intend to create a DDoS tool (i.e., as long as you include some small waits) you should be free to load all the forum pages and "read" (i.e., parse) them…
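A rough sketch of the "small waits" idea, assuming Python and its standard library: the helper names `http_get` and `fetch_politely`, the 2-second default delay, and the User-Agent string are all placeholders chosen for the example, not anything the forum prescribes.

```python
import time
import urllib.request


def http_get(url):
    """Load one page and return its HTML as text (plain GET, no login)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "thread-count-sketch/0.1"}  # placeholder value
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_politely(urls, fetch=http_get, delay_s=2.0, sleep=time.sleep):
    """Yield the result of fetch(url) for each URL, pausing between requests."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_s)  # wait before every request except the first
        yield fetch(url)
```

`fetch` and `sleep` are parameters so the loop can be tried out with stand-ins before pointing it at the real site.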
11-26-2024 06:06 AM
@dvrubins wrote:
I tried to go page by page, but it didn't work. The suggestion to jump through the pages is not amusing because the discussions, although packed by year, are sorted by """relevance""".
I don't understand your sentence. The threads are sorted in reverse chronological order of the latest reply. While this means they're not sorted by start date, most threads are usually only active for a short time, so you can treat the thread starting time and the thread ending time as the same time and ignore the outliers. The suggestion to jump between pages was serious, because each year would span multiple pages (and again, you know how many threads are on each page). You can also assume the threads on the LabVIEW board are the LabVIEW discussions. They're not the only ones, and they're not all about LV, but you're not going to get 100% accurate numbers anyway.
You can also change the threads per page to be up to 200, which will make the number of pages smaller, although to maintain the setting, you would need to do the loading in the same session where you set it.
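To make the page-jumping idea concrete, here is a small sketch in Python. It assumes the pages are in reverse chronological order, so the year of the newest thread on a page never increases as the page number grows, and it uses a binary search to find where each year starts. `page_year` is a hypothetical callback you would have to supply (for example, load page N and read the year of its first thread); the function names are made up for the example, and the result is only as accurate as the approximations described above.

```python
def first_page_at_or_before(year, page_year, num_pages):
    """Smallest 1-based page whose newest thread is from `year` or earlier.

    Returns num_pages + 1 if no page is that old.
    """
    lo, hi = 1, num_pages + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if page_year(mid) <= year:
            hi = mid
        else:
            lo = mid + 1
    return lo


def estimate_threads_per_year(page_year, num_pages, threads_per_page, years):
    """Estimate thread counts as (pages spanned by the year) * threads per page."""
    estimates = {}
    for year in years:
        start = first_page_at_or_before(year, page_year, num_pages)
        end = first_page_at_or_before(year - 1, page_year, num_pages)
        estimates[year] = (end - start) * threads_per_page
    return estimates
```

Each boundary needs only about log2(number of pages) page loads, so even a few thousand pages means roughly a dozen requests per year instead of loading every page.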
@dvrubins wrote:
Do you have a link to the API for Khoros on NI forum?
I don't, but look at their website. I haven't used the API since Khoros was called Lithium (quite a few years ago now), so I don't know what their current API looks like. I'm assuming they still have one.