Posts tagged 'bigdata'

Using Generative AI Without a GPU - How to Use Pandasai & Review

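Today I'll cover how to use Pandasai, an open-source generative AI library that works with OpenAI's models, and share my impressions of it. Pandasai lets you visualize, preprocess, and analyze Pandas DataFrame data through prompts, much as you would with any other generative AI tool. What sets it apart from generative AI services such as ChatGPT or, more recently, DeepSeek is that it runs directly inside your code. It is also fairly simple to use, so it should be easy to follow along.

How to
Register a credit card (one that supports international payments) with OpenAI : https://platform.openai.com/settings/organization/billing/payment-methods..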
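As a rough illustration of the "prompt inside your code" idea, here is a minimal sketch. It assumes pandasai 2.x (where `SmartDataframe` and `pandasai.llm.OpenAI` are importable; other versions may differ) and an OpenAI API key stored in the `OPENAI_API_KEY` environment variable. The helper name `ask_dataframe` and the sample data in the comments are made up for this example.

```python
import os


def ask_dataframe(rows, question):
    """Send a natural-language question about `rows` to an LLM via pandasai."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        # Fail early, before importing heavy packages or calling the paid API.
        raise RuntimeError("Set OPENAI_API_KEY before calling ask_dataframe()")

    # Imported lazily so this file loads even without the optional packages.
    # Assumption: pandasai 2.x, where these import paths exist.
    import pandas as pd
    from pandasai import SmartDataframe
    from pandasai.llm import OpenAI

    llm = OpenAI(api_token=api_key)
    smart_df = SmartDataframe(pd.DataFrame(rows), config={"llm": llm})
    return smart_df.chat(question)


# Example call (hypothetical data; needs a valid key and network access):
# sample = {"country": ["Korea", "Japan", "France"], "sales": [120, 95, 80]}
# print(ask_dataframe(sample, "Which country has the highest sales?"))
```

No GPU is needed locally because the model runs on OpenAI's side; each call is billed against the registered payment method.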

Data/Python 2025.02.12

[Udemy] Data Engineering 101: The Beginner's Guide - Data Pipeline architecture(1)

data architecture
what is good data architecture
performance : using computing and storage resources efficiently
  trade-off between performance and complexity
scalability : data volumes fluctuate
  upstream system fail → increasing data volumes
  scale up/down should be automatic : scale-down can save a lot of money
reliability : available system & avoid failure
  Automate as much as possible → reduce huma..
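The point about automatic scale up/down can be sketched as a small sizing rule. This is only an illustration, not something from the course: the function name, the backlog-based rule, and the default limits are all assumptions.

```python
import math


def desired_workers(queued_items, items_per_worker, min_workers=1, max_workers=10):
    """Return how many workers to run for the current backlog.

    Hypothetical rule: one worker per `items_per_worker` queued items,
    clamped to the range [min_workers, max_workers].
    """
    needed = math.ceil(queued_items / items_per_worker)
    return max(min_workers, min(max_workers, needed))


# Backlog grows (e.g. after an upstream failure): scale up, but stay under the cap.
print(desired_workers(5000, 100))  # 10
# Backlog shrinks: scale down to save money, but keep the minimum running.
print(desired_workers(0, 100))     # 1
```

The upper bound caps cost during spikes, and the lower bound keeps the system available when traffic is quiet.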

Data/Data Engineering 2025.01.29

[Udemy] Data Engineering 101: The Beginner's Guide - Undercurrents

DataOps
DevOps for data
DevOps : deploy software in a more iterative & robust manner
  build, manage cloud infra
  observability of cloud infra
  build automated CI(Continuous Integration)/CD(Continuous Deployment) Pipeline
DataOps : data product deployments more iterative and robust
  build, manage cloud infra for data tools
  observability of data systems(incident reporting and notifications of problems)
  automat..

Data/Data Engineering 2025.01.24

[Udemy] Data Engineering 101: The Beginner's Guide - End-to-end data pipeline in-depth(2)

Ingestion
Ingestion = moving or ingesting data
frequency
  batch vs streaming
  batch : slower = daily or hourly
  streaming : faster = seconds to sub-seconds. real-time
  micro-batch : combination of batch and streaming
Batch ingestion
  convenient
  less latency-sensitive
  more forgiving
Type
  ETL : Extract → Transform → Load
    traditional data warehouse : clean → put DW
    why ETL needs cleaning? DW is expensive!
    most common
  ELT : Extr..
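The micro-batch idea in the notes above (a middle ground between batch and streaming) can be sketched in a few lines. This is a simplified, size-based version; real systems usually cut batches by time window as well, and the names here are made up for illustration.

```python
from itertools import islice


def micro_batches(records, batch_size):
    """Yield lists of up to `batch_size` records from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


events = range(1, 8)  # stand-in for a stream of incoming events
batches = list(micro_batches(events, 3))
print(batches)  # [[1, 2, 3], [4, 5, 6], [7]]
```

Each small batch can be loaded as soon as it fills, which gives lower latency than a daily batch without handling every record individually.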

Data/Data Engineering 2025.01.18

[Udemy] Data Engineering 101: The Beginner's Guide - End-to-end data pipeline in-depth(1)

Generation of source data
structured / unstructured : differences in store, search..
structured data : tabular, 2-dimensional(rows and columns)
  use SQL
  BI, classical ML
unstructured data : files
  use Deep Learning(Neural Networks)
database : if choose wrong database, suffer from performance
RDBMS : Relational
  transactional data, tabular format
  relation between tables
  inflexible, strict, normalized
  single mac..
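To make "structured data : tabular, use SQL" concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `orders` table and its rows are invented for the example.

```python
import sqlite3

# In-memory relational database: rows and columns with a strict schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)],
)

# Because the data is tabular, aggregation is a single declarative query.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 37.5), ('bob', 12.5)]
conn.close()
```

Unstructured files such as images or free text have no such schema, which is why they are stored and searched differently.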

Data/Data Engineering 2025.01.12

[Udemy] Data Engineering 101: The Beginner's Guide - Intro

I'm already in my sixth month at the company. Even if I can't do data engineering myself, I felt I should at least know what data engineering is, what it involves, and what it considers important. So I started taking the Data Engineering 101 course on Udemy. This post doubles as a review!

Data Engineering
Why is Data Engineering important? : Big data requires efficient data handling
Data Engineer
within data team : bridge between data producers and data consumers
  data producer : software engineers and DevOps engineers ..

Data/Data Engineering 2025.01.05

Carat Thinker

bigdata 6
