Frequently asked questions and answers on Data Partitioning and Sharding in Cloud Computing, compiled to help you prepare for interviews, vivas, quizzes, academic courses, and certification exams.
Question-1. How is high availability ensured in sharded systems?
Answer-1: Using replication, failover mechanisms, and data redundancy across nodes.
Question-2. What happens when a shard becomes too large?
Answer-2: It may need to be split, rebalanced, or moved to another node for load distribution.
Question-3. What is consistent hashing?
Answer-3: Consistent hashing distributes data across nodes such that minimal reorganization is needed when nodes are added or removed.
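The behavior described in Answer-3 can be sketched with a small hash ring. This is a minimal illustration, not a production design: the class name `HashRing`, the node names, the choice of MD5 as a stable hash, and the 100 virtual nodes per physical node are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node is placed at several positions on the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        # Owner is the first ring position at or after the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.1%} of keys moved, all to node-d")
```

Going from three to four nodes should move roughly a quarter of the keys, and every moved key lands on the new node; the keys owned by the existing nodes otherwise stay where they were.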
Question-4. Why are cross-shard joins discouraged?
Answer-4: They are complex and slow because data must be retrieved and combined from multiple shards.
Question-5. What is the CAP theorem in context of sharded databases?
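Answer-5: It states that a distributed system cannot guarantee Consistency, Availability, and Partition tolerance all at once. Because network partitions cannot be ruled out between shards and replicas, a sharded database must choose between consistency and availability while a partition lasts.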
Question-6. What is eventual consistency?
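Answer-6: A model where updates propagate to all replicas asynchronously rather than immediately. Reads may briefly return stale data, but if no new updates are made, all replicas eventually converge to the same value.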
Question-7. How is data integrity maintained in a sharded system?
Answer-7: Through constraints, application-level logic, and database mechanisms like transactions and replication.
Question-8. What is elastic scaling in sharding?
Answer-8: It refers to adding or removing shards dynamically to adapt to changing workloads.
Question-9. What is multi-tenancy in sharding?
Answer-9: Each shard may serve a different customer (tenant), isolating their data for scalability and security.
Question-10. What tools support sharding in practice?
Answer-10: Tools like MongoDB, Vitess (for MySQL), Citus (for PostgreSQL), and Cosmos DB support native or plugin-based sharding.
Question-11. What is data partitioning?
Answer-11: Data partitioning is the process of dividing a database into distinct, independent parts to improve performance, manageability, and scalability.
Question-12. What is sharding?
Answer-12: Sharding is a type of partitioning that distributes data across multiple machines or database instances.
Question-13. What is the primary goal of sharding?
Answer-13: The main goal is to scale out a database system to handle large amounts of data and high throughput by distributing the load.
Question-14. How is horizontal partitioning different from vertical partitioning?
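Answer-14: Horizontal partitioning splits a table by rows, so each partition holds a subset of the rows with the same columns. Vertical partitioning splits a table by columns, so each partition holds a subset of the columns for every row.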
Question-15. What is a shard key?
Answer-15: A shard key is the field used to determine how data is distributed across shards.
Question-16. What is range-based sharding?
Answer-16: Range-based sharding divides data into shards based on continuous ranges of the shard key.
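As a rough sketch of range-based routing, the lookup below uses a sorted list of boundaries and a binary search. The boundary values, shard names, and function names are hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical layout: each boundary is the exclusive upper bound of a shard.
# shard-0 holds ids below 1000, shard-1 holds 1000..1999, and so on.
BOUNDARIES = [1000, 2000, 3000]
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(user_id: int) -> str:
    """Return the shard whose range contains user_id."""
    return SHARDS[bisect.bisect_right(BOUNDARIES, user_id)]


def shards_for_range(low: int, high: int) -> list:
    """Return the shards touched by an inclusive range scan [low, high]."""
    first = bisect.bisect_right(BOUNDARIES, low)
    last = bisect.bisect_right(BOUNDARIES, high)
    return SHARDS[first:last + 1]


print(shard_for(999), shard_for(1000))
print(shards_for_range(1500, 2500))
```

Because neighboring keys live on the same shard, a range scan touches only the shards whose ranges overlap the query. The flip side is that sequential keys (such as timestamps) can concentrate writes on the last shard.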
Question-17. What is hash-based sharding?
Answer-17: Hash-based sharding uses a hash function on the shard key to distribute data evenly across shards.
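A minimal sketch of hash-based placement, assuming a fixed shard count of four and SHA-256 as the hash function (both are assumptions for this example, as are the key names):

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed fixed shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how many of 10,000 sequential keys land on each shard.
counts = Counter(shard_for(f"user:{i}") for i in range(10_000))
print(sorted(counts.items()))
```

Each shard should receive close to a quarter of the keys even though the keys themselves are sequential. The trade-off is that range scans on the shard key now have to visit every shard.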
Question-18. What is directory-based sharding?
Answer-18: Directory-based sharding uses a lookup table to map each key to its corresponding shard.
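The lookup table can be pictured as a simple key-to-shard map. The sketch below is illustrative only; `ShardDirectory`, the tenant names, and the shard names are hypothetical, and a real directory would be a replicated, highly available service rather than an in-process dictionary.

```python
class ShardDirectory:
    """Hypothetical in-memory directory mapping keys to shards."""

    def __init__(self):
        self._locations = {}

    def assign(self, key: str, shard: str) -> None:
        """Record (or update) the shard that owns a key."""
        self._locations[key] = shard

    def lookup(self, key: str) -> str:
        """Return the owning shard; raises KeyError for unknown keys."""
        return self._locations[key]


directory = ShardDirectory()
directory.assign("tenant-acme", "shard-1")
directory.assign("tenant-globex", "shard-2")

# After a tenant's data has been copied elsewhere, only the directory
# entry needs to change; no hash function or range boundary is involved.
directory.assign("tenant-acme", "shard-3")
print(directory.lookup("tenant-acme"))
```

The flexibility comes at a cost: every request depends on the directory, so it can become a bottleneck or a single point of failure if it is not replicated and cached.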
Question-19. What are the advantages of sharding?
Answer-19: Sharding improves performance, enables horizontal scaling, and enhances fault isolation.
Question-20. What are the challenges of sharding?
Answer-20: Challenges include increased complexity, rebalancing data, maintaining consistency, and cross-shard queries.
Question-21. What is re-sharding?
Answer-21: Re-sharding is the process of redistributing data among shards, typically needed when a shard becomes too large or hot.
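To see why re-sharding can be expensive, the sketch below measures how many keys change shard when a simple `hash % N` scheme grows from four to five shards. The scheme, the key names, and the shard counts are assumptions for the example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Place a key with a stable hash modulo the shard count."""
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return h % num_shards


keys = [f"order:{i}" for i in range(10_000)]

# Compare each key's placement with 4 shards against its placement with 5.
moved = sum(shard_for(k, 4) != shard_for(k, 5) for k in keys)
fraction_moved = moved / len(keys)
print(f"{fraction_moved:.0%} of keys change shard")
```

A key keeps its shard only when its hash gives the same remainder modulo 4 and modulo 5, which holds for about one key in five, so roughly 80% of the data has to move. This is the cost that schemes such as consistent hashing or directory-based sharding try to reduce.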
Question-22. What is a hot shard?
Answer-22: A hot shard is one that receives disproportionately high traffic, causing performance bottlenecks.
Question-23. How does auto-sharding work?
Answer-23: Auto-sharding automatically distributes data and manages shards behind the scenes without user intervention.
Question-24. What is a shard map?
Answer-24: A shard map keeps track of which data resides on which shard.
Question-25. What is vertical sharding?
Answer-25: Vertical sharding splits a database schema into logical sections, such as groups of tables or columns belonging to one feature, and places each section on a different shard.
Question-26. How do you choose a good shard key?
Answer-26: A good shard key ensures even distribution, minimizes cross-shard operations, and aligns with query patterns.
Question-27. What is a cross-shard query?
Answer-27: A cross-shard query is one that requires data from multiple shards, which can degrade performance.
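The difference between a single-shard and a cross-shard query can be sketched as a scatter-gather over in-memory shards. Everything here is hypothetical: the two shards, the `customer_id % 2` placement rule, and the function names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory shards; rows are placed by customer_id % 2.
SHARDS = {
    0: [{"customer_id": 2, "total": 30}, {"customer_id": 4, "total": 5}],
    1: [{"customer_id": 1, "total": 10}, {"customer_id": 3, "total": 25}],
}


def query_shard(shard_id: int, min_total: int) -> list:
    """Run the filter on one shard only."""
    return [row for row in SHARDS[shard_id] if row["total"] >= min_total]


def orders_for_customer(customer_id: int) -> list:
    """Single-shard query: the shard key appears in the predicate."""
    shard_id = customer_id % len(SHARDS)
    return [r for r in SHARDS[shard_id] if r["customer_id"] == customer_id]


def orders_at_least(min_total: int) -> list:
    """Cross-shard query: scatter to every shard, then gather and merge."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda sid: query_shard(sid, min_total), SHARDS))
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda r: r["total"], reverse=True)


print(orders_for_customer(3))
print(orders_at_least(10))
```

The cross-shard version is only as fast as the slowest shard and needs a merge step in the coordinator, which is where the performance penalty comes from.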
Question-28. How is data consistency handled in sharded systems?
Answer-28: It can be handled using distributed transactions, eventual consistency, or replication mechanisms.
Question-29. Can you shard a SQL database?
Answer-29: Yes, SQL databases like MySQL and PostgreSQL can be sharded, though it may require manual configuration or third-party tools.
Question-30. Can NoSQL databases be sharded?
Answer-30: Yes, most NoSQL databases like MongoDB, Cassandra, and HBase support sharding natively.
Question-31. What is a partition key?
Answer-31: A partition key is the column or field whose value determines which partition a row is stored in.
Question-32. How does partitioning help in query optimization?
Answer-32: Partitioning allows queries to scan only relevant partitions, reducing I/O and improving speed.
Question-33. What is the difference between partitioning and sharding?
Answer-33: Partitioning can occur within a single database instance, while sharding distributes data across multiple instances or nodes.
Question-34. What are composite shard keys?
Answer-34: Composite shard keys consist of multiple fields to provide better distribution and query support.
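As an illustration of the distribution benefit, the sketch below compares hashing on a tenant id alone with hashing on a composite of tenant id and order id. The shard count, the key format, and all names are assumptions for the example.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count


def _bucket(text: str) -> int:
    """Stable hash of a string, reduced to a shard index."""
    h = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    return h % NUM_SHARDS


def shard_by_tenant(tenant_id: str, order_id: int) -> int:
    """Single-field shard key: all rows of a tenant share one shard."""
    return _bucket(tenant_id)


def shard_by_tenant_and_order(tenant_id: str, order_id: int) -> int:
    """Composite shard key: rows of one tenant spread across shards."""
    return _bucket(f"{tenant_id}:{order_id}")


# One very large tenant with 1,000 orders.
orders = [("tenant-big", i) for i in range(1_000)]
single = {shard_by_tenant(t, o) for t, o in orders}
composite = {shard_by_tenant_and_order(t, o) for t, o in orders}
print(len(single), "shard(s) with tenant key;", len(composite), "with composite key")
```

The composite key keeps one large tenant from creating a hot shard, but a query for all of that tenant's orders now has to visit several shards, so the choice depends on the dominant query pattern.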
Question-35. How do cloud databases handle sharding?
Answer-35: Cloud databases often provide built-in auto-sharding and transparent partitioning to scale with demand.
Question-36. What is federated database architecture?
Answer-36: It refers to a system where multiple databases appear as a single logical database but operate independently.
Question-37. What is global secondary indexing in sharded systems?
Answer-37: A global secondary index is an index on a field other than the shard key that covers data on all shards, so queries on that field can locate matching rows without scanning every shard.
Question-38. What is range partitioning?
Answer-38: It divides data based on continuous ranges of values, such as date ranges or ID intervals.
Question-39. What is list partitioning?
Answer-39: List partitioning uses predefined lists of values to distribute rows among partitions.
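A small sketch of list partitioning, assuming rows are partitioned by a country code. The partition names, the value lists, and the catch-all default partition are hypothetical.

```python
# Hypothetical value lists: each partition owns an explicit set of values.
PARTITION_VALUES = {
    "p_europe": {"DE", "FR", "IT"},
    "p_americas": {"US", "CA", "BR"},
    "p_asia": {"JP", "IN", "SG"},
}

# Invert the lists once so each lookup is a single dictionary access.
VALUE_TO_PARTITION = {
    value: name
    for name, values in PARTITION_VALUES.items()
    for value in values
}


def partition_for(country_code: str, default: str = "p_other") -> str:
    """Return the partition for a value, or a catch-all default."""
    return VALUE_TO_PARTITION.get(country_code, default)


print(partition_for("FR"), partition_for("ZZ"))
```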
Question-40. What is hash partitioning?
Answer-40: Hash partitioning uses a hash function on one or more columns to determine the target partition.
Question-41. What is composite partitioning?
Answer-41: Composite partitioning combines two partitioning methods, such as range-hash or range-list.
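A range-hash combination might look like the following sketch, which first partitions by calendar year (the range level) and then by a hash of the customer id (the hash level). The naming scheme, the bucket count, and the column choices are assumptions for illustration.

```python
import hashlib
from datetime import date

HASH_BUCKETS = 4  # assumed number of hash sub-partitions per year


def partition_for(order_date: date, customer_id: str) -> str:
    """Return a range-hash partition name such as 'orders_2024_h2'."""
    # Range level: one partition per calendar year.
    year = order_date.year
    # Hash level: spread each year's rows over a fixed number of buckets.
    digest = hashlib.sha256(customer_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % HASH_BUCKETS
    return f"orders_{year}_h{bucket}"


print(partition_for(date(2024, 3, 1), "customer-17"))
print(partition_for(date(2023, 3, 1), "customer-17"))
```

The same customer always falls in the same hash bucket, while the year picks the top-level partition, so old years can be archived as a unit without disturbing the hash layout.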
Question-42. What is sub-partitioning?
Answer-42: Sub-partitioning is applying a second level of partitioning within each primary partition.
Question-43. What is a partitioned table?
Answer-43: A partitioned table is a table whose data is physically separated into multiple storage segments or partitions.
Question-44. What is the use of metadata in sharded databases?
Answer-44: Metadata stores the mapping of shard keys to shard locations and helps route queries appropriately.
Question-45. What is logical vs. physical partitioning?
Answer-45: Logical partitioning is how data is organized conceptually, while physical partitioning refers to actual storage layout.
Question-46. How do you monitor a sharded database?
Answer-46: Monitoring involves tracking shard health, load balancing, replication lag, and query performance.
Question-47. What is key-based routing?
Answer-47: Key-based routing uses the shard key to route queries directly to the correct shard.
Question-48. What are some drawbacks of sharding?
Answer-48: Drawbacks include operational complexity, difficult joins, and challenges in global transactions.
Question-49. Can indexes be used in partitioned tables?
Answer-49: Yes, local and global indexes can be used depending on the database system and use case.
Question-50. What is partition pruning?
Answer-50: Partition pruning eliminates irrelevant partitions during query execution to improve performance.
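Partition pruning can be sketched as an overlap test between a query's date range and each partition's range. The monthly partitions and the `prune` function below are hypothetical; real databases perform this step inside the query planner.

```python
from datetime import date

# Hypothetical monthly partitions: name -> half-open range [start, end).
PARTITIONS = {
    "orders_2024_01": (date(2024, 1, 1), date(2024, 2, 1)),
    "orders_2024_02": (date(2024, 2, 1), date(2024, 3, 1)),
    "orders_2024_03": (date(2024, 3, 1), date(2024, 4, 1)),
}


def prune(query_start: date, query_end: date) -> list:
    """Return only the partitions overlapping [query_start, query_end)."""
    return [
        name
        for name, (start, end) in PARTITIONS.items()
        if start < query_end and query_start < end
    ]


# A query restricted to mid-February needs to scan one partition only.
print(prune(date(2024, 2, 10), date(2024, 2, 20)))
```

Pruning only helps when the query filters on the partition key; a predicate on any other column forces a scan of every partition.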