
How to crawl websites with Selenide and JDK 14+

Sometimes we need data that would otherwise have to be fetched manually from a website. As developers, automation is our friend, so we can write an automated approach to crawl the website instead of searching for all this information ourselves. I’ve recorded a video in which I fetch some data from my blog website and transform it into CSV format, using Selenide and some new Java features such as Records.

Please be a nice citizen and only use such techniques for websites and situations where you’re allowed to do so, and where your actions don’t disrupt any service.

You can find the code example on GitHub: Selenium Playground

What we’re doing is using Selenide, with its helpful queries and methods, together with Java Records and Streams to map the entries of my blog to the desired output format. The difference from using a web API is that we have to be a bit more creative in how we identify and extract the individual parts, since the data is not necessarily structured for automated consumption.
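The mapping step can be sketched roughly as follows. This is a minimal illustration, not the code from the video or the repository: the `Entry` record, its fields, the CSV header, and the `example.com` URLs are all assumptions made for the example. The Selenide querying part is left out on purpose, since the selectors depend entirely on the page markup; the plain strings below stand in for the text and attribute values you would read from the page elements. Records are a preview feature in JDK 14 and 15 (requiring `--enable-preview`) and a standard feature from JDK 16 on.

```java
import java.util.List;
import java.util.stream.Collectors;

public class BlogEntryCsv {

    // Hypothetical shape of one blog entry; the field names are assumptions.
    record Entry(String title, String link, String date) {

        // Renders this entry as a single CSV line with every field quoted.
        String toCsvLine() {
            return List.of(title, link, date).stream()
                    .map(BlogEntryCsv::quote)
                    .collect(Collectors.joining(","));
        }
    }

    // Wraps a value in double quotes and doubles any embedded quotes,
    // so that commas or quotes inside a title don't break the CSV.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Maps all entries to CSV lines and prepends a header row.
    static String toCsv(List<Entry> entries) {
        String body = entries.stream()
                .map(Entry::toCsvLine)
                .collect(Collectors.joining("\n"));
        return "title,link,date\n" + body;
    }

    public static void main(String[] args) {
        // Stand-in values; in a real crawl these would come from page elements.
        List<Entry> entries = List.of(
                new Entry("First post", "https://example.com/first", "2020-01-01"),
                new Entry("A \"quoted\" title", "https://example.com/second", "2020-02-01"));

        System.out.println(toCsv(entries));
    }
}
```

Keeping the scraped values in a small record separates the fragile part (finding elements on the page) from the stable part (formatting the output), so a change in the page markup only affects the code that builds the `Entry` instances.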

Published on Java Code Geeks with permission by Sebastian Daschner, partner at our JCG program. See the original article here: How to crawl websites with Selenide and JDK 14+

Opinions expressed by Java Code Geeks contributors are their own.


Sebastian Daschner

Sebastian Daschner is a self-employed Java consultant and trainer. He is the author of the book 'Architecting Modern Java EE Applications'. Sebastian is a Java Champion, Oracle Developer Champion and JavaOne Rockstar.