{"id":2306,"date":"2022-10-03T11:33:17","date_gmt":"2022-10-03T15:33:17","guid":{"rendered":"https:\/\/shirishranjit.com\/blog1\/?page_id=2306"},"modified":"2023-04-14T16:27:16","modified_gmt":"2023-04-14T20:27:16","slug":"object-storage-design-principles","status":"publish","type":"page","link":"https:\/\/shirishranjit.com\/blog1\/technical-posts\/object-storage-design-principles","title":{"rendered":"Object Storage Design Principles"},"content":{"rendered":"\n<p>A data lake can be broadly divided into distinct buckets, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Raw data<\/strong>&nbsp;\u2013 Data received from the source without any change. This data is kept immutable. Formats include structured, semi-structured, and unstructured data objects such as databases, backups, archives, JSON, CSV, XML, text files, or images.<\/li>\n\n\n\n<li><strong>Transformed<\/strong>&nbsp;\u2013 Raw data that has been processed, for example normalized for a specific use case, to improve performance and reduce cost. Data may also be converted into columnar formats, such as Apache Parquet and Apache ORC, which Amazon Athena can query. At this stage, data is only transformed, not cleansed.<\/li>\n\n\n\n<li><strong>Curated<\/strong>&nbsp;\u2013 The data is further enriched by blending it with other data sets to provide additional insights. 
At this stage, the data is cleansed and optimized for uses such as analytics and reporting.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image.png\"><img loading=\"lazy\" decoding=\"async\" width=\"947\" height=\"479\" src=\"https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image.png\" alt=\"\" class=\"wp-image-2307\" srcset=\"https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image.png 947w, https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image-300x152.png 300w, https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image-768x388.png 768w, https:\/\/shirishranjit.com\/blog1\/wp-content\/uploads\/2022\/09\/image-500x253.png 500w\" sizes=\"(max-width: 947px) 100vw, 947px\" \/><\/a><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">References:<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS S3 &#8211; https:\/\/docs.aws.amazon.com\/whitepapers\/latest\/building-data-lakes\/data-lake-foundation.html<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>A data lake can be broadly divided into distinct buckets, including: 
References:<\/p>\n","protected":false},"author":4,"featured_media":0,"parent":198,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2306","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/pages\/2306"}],"collection":[{"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/comments?post=2306"}],"version-history":[{"count":4,"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/pages\/2306\/revisions"}],"predecessor-version":[{"id":2440,"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/pages\/2306\/revisions\/2440"}],"up":[{"embeddable":true,"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/pages\/198"}],"wp:attachment":[{"href":"https:\/\/shirishranjit.com\/blog1\/wp-json\/wp\/v2\/media?parent=2306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}