How can we process large data sets, both structured and unstructured, in the most efficient way? Is this a big data challenge that requires case-by-case solutions?
When we need to serve a large set of data, most databases provide a data streaming mechanism.
When using JDBC, we need to call Statement#setFetchSize(int) to give the DB/JDBC driver a hint about the number of result set rows to fetch in a single batch. Because this is only a hint, it may not be enough to prevent the ResultSet from collecting all rows in memory, or the driver from waiting until the DB has produced the complete result.
For PostgreSQL, it is also required to call setAutoCommit(false) on the Connection object, because PostgreSQL only enables cursors inside a transaction.
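Putting the two requirements together, a minimal sketch might look like the following. The JDBC URL, credentials, table name, and the fetch size of 50 are placeholders, not values prescribed by the driver:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingQuery {

    // With a fetch size of n, the driver makes roughly ceil(total / n)
    // round trips to the server instead of materializing everything at once.
    static long fetchRoundTrips(long totalRows, int fetchSize) {
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    static void streamRows(Connection conn) throws SQLException {
        // PostgreSQL only uses a cursor inside a transaction,
        // so auto-commit must be off for the fetch-size hint to take effect.
        conn.setAutoCommit(false);
        try (Statement stmt = conn.createStatement()) {
            // Hint: fetch 50 rows per round trip rather than the whole result.
            stmt.setFetchSize(50);
            try (ResultSet rs = stmt.executeQuery("SELECT id FROM big_table")) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id"));
                }
            }
        } finally {
            conn.commit(); // end the transaction, closing the cursor
        }
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; adjust for your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {
            streamRows(conn);
        }
    }
}
```

With this setup, memory usage stays proportional to the fetch size rather than to the total result size; the trade-off is one extra server round trip per batch.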
- Postgres cursor – https://jdbc.postgresql.org/documentation/head/query.html
If you have a `LIMIT` clause in your query, read on.