Ai Cheat Sheet

Data Governance

Data governance is the overall management of the availability, usability, integrity, and security of the data used within an organization. It is a framework of policies, procedures, and standards that keeps data reliable, accurate, and consistent across the enterprise.

In data engineering, data governance covers the processes and practices that keep an organization's data at high quality and fit for the needs of the business. This includes defining data quality standards, ensuring that data is properly documented and labeled, and establishing processes for data access and usage.

For example, consider a company that wants to implement data governance practices for its customer data. The first step would be to define what data quality means for this organization. This might include ensuring that customer data is complete, accurate, and up to date.

Next, the company would establish processes for ensuring data quality. This might include setting up automated data validation checks, implementing data entry standards, and providing training to employees who handle customer data.
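An automated validation check of this kind could look roughly like the sketch below. It is only an illustration: the field names, the email pattern, the one-year freshness window, and the `validate_customer` function are all assumptions made for this example, not part of any particular tool or standard. Each rule maps to one of the quality dimensions mentioned earlier (complete, accurate, up to date).

```python
import re
from datetime import date, timedelta

# Hypothetical rules; a real policy would come from the organization's
# own data quality standards.
REQUIRED_FIELDS = ("customer_id", "name", "email", "last_updated")
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MAX_AGE = timedelta(days=365)


def validate_customer(record: dict, today: date) -> list[str]:
    """Return a list of data quality issues found in one customer record."""
    issues = []

    # Completeness: every required field is present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")

    # Accuracy, approximated here by a simple format check on the email.
    email = record.get("email")
    if email and not EMAIL_PATTERN.match(email):
        issues.append("invalid email format")

    # Timeliness: the record was updated within the allowed window.
    last_updated = record.get("last_updated")
    if last_updated and today - last_updated > MAX_AGE:
        issues.append("stale record")

    return issues
```

A pipeline might call `validate_customer` on every incoming record and route records with a non-empty issue list to a review queue instead of loading them directly.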

The company would also need to establish policies and procedures for data access and usage. For example, they might set up data access controls to ensure that only authorized employees can access customer data, and establish protocols for how the data can be used and shared.
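As a minimal sketch of such an access control, the example below maps roles to permitted actions and masks the email address for roles that may only see de-identified data. The role names, the `ACCESS_POLICY` table, and the helper functions are hypothetical; in practice these rules are usually enforced by the database or data platform rather than in application code.

```python
from dataclasses import dataclass

# Hypothetical policy: which roles may perform which actions on customer data.
ACCESS_POLICY = {
    "support_agent": {"read"},
    "data_steward": {"read", "update"},
    "analyst": {"read_masked"},
}


@dataclass(frozen=True)
class User:
    name: str
    role: str


def is_allowed(user: User, action: str) -> bool:
    """Check the user's role against the policy; unknown roles get no access."""
    return action in ACCESS_POLICY.get(user.role, set())


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def read_customer_email(user: User, record: dict) -> str:
    """Return the email in full, masked, or not at all, depending on the role."""
    if is_allowed(user, "read"):
        return record["email"]
    if is_allowed(user, "read_masked"):
        return mask_email(record["email"])
    raise PermissionError(f"role {user.role!r} may not read customer data")
```

Denying by default (an unknown role maps to an empty permission set) keeps the policy table the single place where access is granted.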

Overall, data governance is a critical aspect of data engineering as it helps ensure that the data used by an organization is reliable, accurate, and secure.

Categories

  1. Data Strategy

  2. Data Management

  3. Data Quality

  4. Data Operations

  5. Data Platforms

  6. Supporting Process
