Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-24907][SQL][WIP] Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API #21861

Closed
wants to merge 1 commit into from

Conversation

tengpeng
Copy link
Contributor

What changes were proposed in this pull request?

(After the update of DataSourceV2 API, this PR will be updated)

Changes: Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API

  • This PR is minimum viable and other JDBC features will be added soon, but it is ready for review to see if we are in the right direction.
  • This PR also contains a minor improvement on DataSourceOptions (to make it serializable).
  • It does not support DLL due to the limitations of the new API

How was this patch tested?

New unit tests.

@gatorsmile
Copy link
Member

Thanks for your PR, but we plan to propose new changes in DataSourceV2 APIs very shortly.

@tengpeng
Copy link
Contributor Author

@gatorsmile Got you. I will update the implementation after DataSourceV2 API changes.

@tengpeng tengpeng changed the title [SPARK-24907][WIP] Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API [SPARK-24907][SQL][WIP] Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API Jul 25, 2018
@xianyinxin
Copy link
Contributor

Any update on this? @tengpeng

@tengpeng
Copy link
Contributor Author

tengpeng commented Jul 13, 2019

@xianyinxin I have not followed the progress on the V2 APIs for a while, but I am still interested in implementing the V2 APIs for JDBC. I am OK to co-author if you are interested in this PR.

@shivsood
Copy link
Contributor

@tengpeng i can also help co-author here. Should we start fixing this branch or create a new branch based on latest V2? I have a local branch with with basic scaffolding for read and write paths based on latest V2. I have not push that out as yet, i can if you want to start using this. FYI to @xianyinxin @gaborgsomogyi who have also expressed their interest in authoring this.

@xianyinxin
Copy link
Contributor

Glad you have already some work on this @shivsood . Since this is a private repo which is not allowed multi people push each one's code, my suggestion is folk this repo, cherry-pick @tengpeng 's commit, rebase it based on current V2 code, apply your fix, and push your code to your own branch, and then submit pr @shivsood . Would you agree? @shivsood @tengpeng

@shivsood
Copy link
Contributor

shivsood commented Jul 16, 2019 via email

@shivsood
Copy link
Contributor

shivsood commented Jul 16, 2019 via email

@xianyinxin
Copy link
Contributor

@shivsood OK for me.

@shivsood
Copy link
Contributor

shivsood commented Jul 16, 2019 via email

@xianyinxin
Copy link
Contributor

@shivsood Accepted. Seems good.

@shivsood
Copy link
Contributor

@tengpeng Tried cherry-picking your change. It's not a direct cherry pick. Directory paths have changed and updated files also have changes. Some of your work in JDBCOptionsV2 , JDBCDataSourceV2.scala and Test is very much relevant, but needs to be adapted into the new baseline. Suggest you take a shot at manually merging and make a commit?

@shivsood
Copy link
Contributor

@gaborgsomogyi have added you as collaborator

@gaborgsomogyi
Copy link
Contributor

@shivsood thanks for considering me as collaborator. Since I'm mainly in the streaming area my main intention is to implement Structured Streaming source/sink based on the SQL implementation. Additionally you've guys already picked this up so I'm contributing in this effort with review comments.

@shivsood
Copy link
Contributor

Since I'm mainly in the streaming area my main intention is to implement Structured Streaming source/sink based on the SQL implementation. Additionally you've guys already picked this up so I'm contributing in this effort with review comment

@gaborgsomogyi Thanks. Reviews would be a great help indeed. Also look forward to your streaming source/sinks work.

@shivsood
Copy link
Contributor

@tengpeng @xianyinxin @gaborgsomogyi
Pushed a PR with scaffolding for read/write paths and first draft implementation of write( append) path based on current APIs #25211
Readme.md added for high level work items. Find it at org/apache/spark/sql/execution/datasources/v2/jdbc/Readme.md.

Feel free to comment/contribute as relevant. Thanks.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@github-actions
Copy link

github-actions bot commented Jan 9, 2020

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 9, 2020
@github-actions github-actions bot closed this Jan 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants