Love how nobody wants to sleep tonight and we’re all still up chatting away! Glad we’re all enjoying our first hours of the weekend :)
(Except the mods though. Smh my head. Look at them being responsible, reasonable, smart adults and sleeping at a reasonable hour. Almost 3am and my post still isn’t pinned! Bloody responsible early bird people)
My excuse is that I’m badgering GPT for Python code to fulfill a niche but simple purpose
I’ve also been doing a bit of research into the biology behind weight gain/loss, fat storage, ketones, nutritional requirements, etc. Funky thing, the body. Definitely needs a few big fixes though
Curious. What’s the purpose?
Also agreed that it is fun seeing a night-time crew around. I miss the check-ins!
Honestly, it probably already exists, but I just want something that does exactly what I want and no more, and is easy to set up:
I’ve got a browser extension that extracts all URLs on a webpage and merges them into a JSON file. I use those for archiving mass amounts of URLs onto the Wayback Machine, via another utility the Internet Archive has that archives all URLs from a Google Sheet.
Google Sheets doesn’t allow importing a JSON file, though. So the Python script takes a bunch of little JSON files with a few hundred URLs each and converts them into a CSV file that I can just import into GSheets. It’s like 60 lines of code, with a few extra bells and whistles added in for error handling. Very simple
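For the curious, the core of it is roughly this (a simplified sketch, not my exact script — it assumes each JSON file is just a flat list of URL strings, which may not match the extension’s actual output):

```python
# Sketch: merge every *.json file in a folder into one CSV for Google Sheets.
# Assumes each JSON file is a flat list of URL strings.
import csv
import json
from pathlib import Path

def json_urls_to_csv(json_dir: str, out_csv: str) -> None:
    urls = []
    for path in sorted(Path(json_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            print(f"Skipping {path.name}: {err}")  # basic error handling
            continue
        urls.extend(u for u in data if isinstance(u, str))

    # One URL per row; Google Sheets imports this format directly.
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for url in urls:
            writer.writerow([url])

json_urls_to_csv("exported_json", "urls.csv")
```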
Again, just curious, but why go via Sheets and not their API? Are you crawling? (You mentioned a browser extension, so guessing not.) There might be some unnecessary complexity here, if I’m picturing this correctly at this hour.
This came out longer than anticipated and I’m a bit too smooth-brained at the moment to remove all the guff and rephrase. Sorry. Not a rant! Just a livestream of consciousness, basically
I couldn’t figure out how to work their API. I got an API key and all that, but things just weren’t working
There’s a set of Save Page Now utilities I could use API-free, but they’re all Linux shell scripts, and I couldn’t figure out how to run them on Windows without messing around with WSL (a bit beyond my capabilities). When I tried them out on my MacBook they worked, but from memory not how I wanted
I also found the IA’s documentation to be missing, difficult to find, or outdated in a lot of areas, which meant that when I last tried to get GPT to work it out, it was trying to use deprecated API calls and an outdated authentication method, and I couldn’t make it work much better myself
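For what it’s worth, this is roughly what I think the current Save Page Now call is supposed to look like, going by the SPN2 docs I could find: an S3-style key pair from the IA’s account page, and a POST to the save endpoint. Entirely possible this sketch is itself outdated — which was the whole problem:

```python
# Hedged sketch of a Save Page Now 2 request. The endpoint and the
# "LOW key:secret" auth header are my understanding of the current API;
# verify against the IA's docs before relying on this.
import requests

ACCESS_KEY = "your-access-key"  # placeholder, from the archive.org S3-like keys page
SECRET_KEY = "your-secret-key"  # placeholder

def save_page(url: str) -> dict:
    resp = requests.post(
        "https://web.archive.org/save",
        headers={
            "Accept": "application/json",
            "Authorization": f"LOW {ACCESS_KEY}:{SECRET_KEY}",
        },
        data={"url": url},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()  # should include a job id you can poll for status

print(save_page("https://example.com"))
```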
Could probably give it another go. Having it take the URLs from the CSV could work. But anything before that (like crawling) doesn’t work so well, because some of the things I archive require manual intervention anyway to properly extract all URLs. For instance, Lemmy threads start auto-collapsing after 300 comments, so they need to be expanded to retrieve comment links, and photos hidden in spoilers need to be expanded to retrieve the image URL. That sort of thing. Possible to automate, but it would probably take more time to automate than it would save compared to just doing it manually
I did actually attempt to get GPT to make a crawler for a completely different purpose once, and it didn’t work. I don’t remember exactly what went wrong, but from memory it was misinterpreting status codes and couldn’t parse the HTML properly. Easier to just fork somebody else’s crawler and modify it to work with the other scripts, I guess
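(For contrast, the boring-but-correct core of a link extractor is only a few lines — explicit status handling plus a real HTML parser. It still wouldn’t help with the collapsed-comments problem above, since that needs JavaScript to run:)

```python
# Minimal link extractor: fail loudly on bad status codes and let
# BeautifulSoup do the HTML parsing rather than regex guesswork.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def extract_links(page_url: str) -> list[str]:
    resp = requests.get(page_url, timeout=30)
    resp.raise_for_status()  # raise on 4xx/5xx instead of misreading them

    soup = BeautifulSoup(resp.text, "html.parser")
    # Resolve relative hrefs against the page URL.
    return [urljoin(page_url, a["href"]) for a in soup.find_all("a", href=True)]

print(extract_links("https://example.com"))
```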
Also, importing it into a sheet doesn’t actually take that much work. It’s basically three mouse clicks, then heading to the IA’s sheet batch archiving page and pasting in the URL. Their batch processing is a bit inefficient and can take a few days; done through the API it could definitely be faster, with a bit of smart logic put in to avoid going over daily archive caps, plus a queueing system. But those few days don’t require any active energy on my part. They keep processing it in the background at a rate of a row or two a minute, then send me an email once it’s done
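If I ever did wire it up, the “smart logic” would probably just be a throttled queue, something like this sketch (the cap and the delay are guesses, not the IA’s actual published limits):

```python
# Sketch: drain a URL queue without blowing past an assumed daily cap.
import time
from collections import deque

DAILY_CAP = 1000       # assumption — check the IA's real limits
SECONDS_BETWEEN = 15   # spacing between submissions, also a guess

def save_page(url: str) -> None:
    """Placeholder for the actual Save Page Now request."""
    print(f"would archive {url}")

def drain_queue(urls: list[str]) -> list[str]:
    queue = deque(urls)
    done_today = 0
    while queue and done_today < DAILY_CAP:
        save_page(queue.popleft())
        done_today += 1
        time.sleep(SECONDS_BETWEEN)
    return list(queue)  # leftovers to resume the next day

leftover = drain_queue(["https://example.com"])
```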
Those clicks require effort though and the dev in me would not be clicking anything.
But the dev and mentor in me is now desperately wanting to jump on a call with you to pair and so I’m closing this tab.
Haha super kind of you, but don’t waste your lovely Saturday on helping me with my silly projects! If anything, this script is largely about helping me become more familiar and comfortable with command line tools and troubleshooting/debugging python scripts.
Eventually I’m sure the extra clicks may get annoying, or something will break in the current flow and the API will be the last option, or whatever. When that happens, I might very well post up here: “hey @indisin, you know that time 9 years ago, on the 11th of January 2025 at about 4am, when you really wanted to help me with a dumb project? Does your offer still stand?” 😂
small tangent
I know using an LLM to write code isn’t exactly “learning” very much in the conventional sense, but honestly, I did try to actually learn Python a few times, and it all just went over my head. One of the most helpful things about ChatGPT is that, at least for now before it fully enshittifies, I can ask it as many questions as I like. I can ask it to do something I’d never have any clue how to do myself, then ask it why it did specific things specific ways, what would happen if X was changed to Y, or why thing Z needs to be at position XY in the code. A real person would’ve blocked me a long, long time ago 😆
Always keep learning. Learning by doing and making your own mistakes in this space is more valuable than stating a degree on a CV in my opinion.
In 9 years I know that I won’t get that message; instead it will be “hey @indisin, I saw what you were thinking 9 years ago, here’s how I made it better, and now I’m gonna show off and school you”.
On LLMs: I have absolutely no challenge with using them for your use case; you just need to remember they can’t have an original thought, which is something you don’t need today. If you do end up pursuing this as a career, though, know that the money is stupid, the days are easy, and you must (absolutely must) learn UML basics to communicate effectively. Also, knowing your Gang of Four design patterns will simplify your cognitive load and implementations, and then domain-driven design will help how you think about problems. Also, Raspberry Pis are perfect toys for playgrounds. All very easy money, and you can work anywhere in the world.
I’m leaving that there, as we’re at the point of needing a call; otherwise I’ll start replying with thesis-level word counts haha.
Good luck though and most importantly have fun!
Edit:

“A real person would’ve blocked me a long, long time ago”

No, a good dev would’ve got excited about teaching you, and you’d have had to block them haha