Advances in data processing and storage have led organizations to invest heavily in "data assets" to drive innovation. Yet when technologies require complementary insights, companies cannot innovate in isolation; they must draw on community ecosystems to meet the challenges of creating datasets and benchmarks. This paper identifies different types of data-driven innovation in communities and investigates how data affects the balance between competition and collaboration, the creation and dissemination of benchmarks, and, ultimately, innovation outcomes. Through a multi-method study of 27 data-driven innovation communities in synthetic biology, we develop a framework of innovation communities that resolve the tension between collaboration and competition to different degrees when using a range of benchmarks. We demonstrate that openness not only motivates the development and standardization of data benchmarks but also accentuates their pivotal role in shaping collaboration and competition dynamics. Finally, we compare a spectrum of data-driven innovation communities to illustrate which collaboration types best suit specific kinds of data and to offer diverse strategies for incentive management.