-
-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiplatform #209
base: master
Are you sure you want to change the base?
Multiplatform #209
Conversation
Some tasks are still not entirely the same and will be updated later
A few tests still need fixing up caused by formatting issues
Using JSDOM for emulation. Since this option is rather slow, it will only be used when jsExecution is requested In the browser the default implementation via DOMParser will be used In node linkedom will be used, since it's implementation is more resource friendly than JSDOM and perfectly suitable for web scrapping Remaining issues are formatting related and will require a custom formatter to match the output of Jsoup
All tests are now passing on the JS side
# Conflicts: # assertions/build.gradle.kts # build.gradle.kts # buildSrc/src/main/kotlin/Deps.kt # dsl/build.gradle.kts # examples/android/app/build.gradle # examples/scraping/build.gradle.kts # examples/use-pre-release-version/build.gradle.kts # fetcher/async-fetcher/build.gradle.kts # fetcher/async-fetcher/src/jvmMain/kotlin/it/skrape/fetcher/AsyncFetcher.kt # fetcher/base-fetcher/build.gradle.kts # fetcher/browser-fetcher/build.gradle.kts # fetcher/browser-fetcher/src/jvmMain/kotlin/it/skrape/fetcher/logging/LoggingConfigurator.kt # fetcher/browser-fetcher/src/jvmMain/resources/META-INF/services/ch.qos.logback.classic.spi.Configurator # fetcher/browser-fetcher/src/jvmMain/resources/logback.xml # fetcher/http-fetcher/build.gradle.kts # fetcher/http-fetcher/src/jvmMain/kotlin/it/skrape/fetcher/HttpFetcher.kt # gradle.properties # html-parser/build.gradle.kts # html-parser/src/commonMain/kotlin/it/skrape/selects/DomTreeElement.kt # integrationtests/build.gradle.kts # ktor-extension/build.gradle.kts # mock-mvc-extension/build.gradle.kts # settings.gradle.kts # test-utils/build.gradle.kts
Upgraded language version and apiTarget to 1.7
Introduced karma middleware to allow test containers to be spun up from the browser Reworked wiremock stubs to use JSON api and ktor client for multiplatform compatibility Temporarily allow TestContainers to be run on Windows for testing (and while we figure out how to block them on certain platforms)
NodeJS is now working using testcontainers directly Block testcontainers from being bundled with webpack Improved robustness of karma testcontainer middleware Expose ROOT_PROJECT_PATH to NodeJS tests
…ultiplatform � Conflicts: � fetcher/browser-fetcher/build.gradle.kts
Wow this is great news 🤩 Let me know if there are any questions that can maybe help to understand internals of the Lib if needed. I personally have no experience so far with creating multiplattform Libs in kotlin. In my daily life I am very much into using kotlin on the jvm/server-side. I am really glad if skrape{it} could become a multiplatform Lib 🥳 |
My main question right now would be if multiple fetchers are really necessary. I geuss the question is if that is okay with you and if that's a direction you feel comfortable with? |
Hey, sorry for late response.
the different fetchers (beside BrowserFetcher which has a little more special purpose) only exists because of convenience for the users to either run blocking or non-blocking.
if this could be simplified and as you mentioned "fetching can only be done via a suspend function on non JVM platforms" anyway i would apprechiate simplification (reducing it to 1) of the fetchers a lot as long as we provide an intuitive way to still make blocking fetches on the JVM possible. either by at least documenting that wrapping calls in a runBlocking scope is necessary or preferred provide something like a just a quick idea somewhat inspired by what kohttp dsl is doing here, but i am open for other ideas as well:
i am totally fine to switch to ktor client completely as a build-in/default as long if we still provide a way for users to implement custom fetchers to keep the lib flexible and not completely dependent on ktor. i am not deep into ktor client, if it is possible to implement custom ktor clients anyway that would be good enough.
That sounds resonable to me and would again simplify the code base a lot =) i like! Last but not least the resaon why Browserfetcher is an own fetcher but also part of the html-parser
overall absolutely =) |
Thanks for clearing all that up! It might even be possible to replace Skrape.it's own Request class with the Ktor Request builder, since they both serve the same purpose. Could you maybe provide some tests in which the BrowserFetcher specific features are used so I can make sure it still fulfills those requirements? |
The fix is more of a hack that redirects any https request to the http url. Since this is only used for testing and ssl validation is always relaxed it should be fine
Codecov Report
@@ Coverage Diff @@
## master #209 +/- ##
==========================================
- Coverage 83.18% 0.00% -83.17%
==========================================
Files 39 40 +1
Lines 1141 1173 +32
Branches 173 145 -28
==========================================
- Hits 949 0 -949
- Misses 131 1173 +1042
+ Partials 61 0 -61
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Hey, But anyway I took a week off in mid November and will try to provide some tests with the browserfetcher specific features :)
That sounds absolutely fantastic 🙂
Tbh I like the look and feel of skrape{it}s request dsl more than ktors. But we could map skrape{it} request dsl to ktor request object internally |
Test coverage for JVM works again JS test coverage is not included and the report seems to not work anyway
…tter use with multiplatform
For now it only affects downgrading when following redirects. sslRelaxed might not be possible on a per-request basis and may need to be a property of the scraper itself
Deprecated old Scraper methods. Still need to be mapped to the new KtorScraper
JVM works as before except a proxy setting for the request. Will need further testing if it's possible on a per-request basis JS is pretty difficult, especially in the browser. Some features like disabling redirects and proxies won't be supported.
Replaced Atrium with Kotest-Assertions in preparation for a move to the Kotest Framework for more fine-control over test execution Switched JS builds to use the IR-Compiler Updated kotlin version to 1.7.21 Modified test to match new Request style
Switched to Kotest Improved resources for multiplatform tests Fixed out of spec behaviour in css selector generation where attributes where enclosed in quotes Some tests had to be disabled on JS for causing issues JS Web will require proxy support to be truly usable
Hi! So I've been working along on the multiplatform/JS implementation of skrape.it. It's fully functional on JS now and I'd say I'm quite close to completion. The major missing feature right now is Proxy support on JS. Because of some of the (breaking) changes I've made this would probably be better suited for a major realease. Here is a (non inclusive) list of what I've changed since the last time: Breaking Changes
Notable Changes
Known issues and limitations
TODOs
If you have the time I'd be very happy if you could look the changes over and give me some feedback if you are happy with the direction this is going. Cheers |
hey @McDjuady this is such awesome news. a really really big thank you for all the your effort so far. you rock! first of all don't worry because of breaking changes. we will release a new major version :) i will have a look at the code at latest until monday. job is much more chilled again currently^^ PS: skrape.it docs are hosted via gitbook.com, can not see or do much over there since its SaaS. side-note: i thought about having a better documentation site. i played around with kotlin-playground) in the past. the idea was to have kind of an interactive docu like the official kotlin docs where you can try out code snippets directly. i wanted to work on that again soon, therefore i wouldn't like to invest to much time in the old docs setup / want only add documentation there. |
…recations Updated tests to use new API
again sorry for late response. i just reviewed what you did so far. looks good to me 👍 |
Any update on this? |
Hey @McDjuady, |
Is this dead? 😢 |
at least for this PR - it looks like. any help help or pull requests welcome |
I've been working over the last few months to create a multiplatform version for skrape.it
It's somewhat in line with #196 but I'm a bit further along in some areas, which is why i wanted to get this out.
So far I've converted the buildscripts to multiplatform and implemented some modules in JS.
This is still pretty much WIP and I intend to keep working on it. I'll update the pull request as I get further along and improve the code
What's done so far:
What still needs to be done:
Other notable changes: