In a previous blog, I wrote about why it’s so important to select the right partner for your data encryption needs. Now, I’d like to cover the five questions every encryption vendor must be able to answer about big data security. And if you don’t like their response, by all means, move on:
Does the solution give you full control over your keys, even as data flows from one system to another?
It’s often said that key management is the hardest part of data encryption. That’s because there’s often a lack of clarity around key management and access. When evaluating encryption vendors, ask what types of key-control policies can be established to prevent unauthorized access, and confirm that the data owner, not the cloud provider or other administrator, retains complete control of the encryption keys.
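To make the ownership requirement concrete, here is a minimal Python sketch of a key server whose release policy is set by the data owner alone. The names (KeyServer, register_key, request_key) are illustrative, not any vendor’s API:

```python
# Minimal sketch of owner-controlled key release. The data owner sets
# the access policy, so the cloud provider or a platform administrator
# never sees the key. Hypothetical names, not a real product API.

class KeyServer:
    def __init__(self):
        self._keys = {}    # key_id -> key bytes
        self._policy = {}  # key_id -> set of client ids the owner authorized

    def register_key(self, key_id, key, authorized_clients):
        """Called by the data owner, who alone defines the access policy."""
        self._keys[key_id] = key
        self._policy[key_id] = set(authorized_clients)

    def request_key(self, key_id, client_id):
        """Release the key only to clients the owner authorized."""
        if client_id not in self._policy.get(key_id, set()):
            raise PermissionError(f"{client_id} is not authorized for {key_id}")
        return self._keys[key_id]

server = KeyServer()
server.register_key("db-volume-1", b"\x01" * 32, ["app-server"])
server.request_key("db-volume-1", "app-server")     # key released to the app
# server.request_key("db-volume-1", "cloud-admin")  # raises PermissionError
```

Real deployments layer authentication and auditing on top, but the shape is the same: the policy lives with the owner, not the host.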
Does the encryption solution allow for separation of duties between authorized personnel and systems administrators?
What good is data encryption if everyone, whether they need it or not, has access to the encrypted data? Proper, policy-controlled key management enables separation of duties, allowing system and cloud administrators to perform their jobs while restricting them from accessing encrypted data. The most important part of key management is ensuring the keys do not reside on the same server as the encrypted data. Storing them together is akin to locking your car and leaving the keys in the driver’s side door.
Does the solution work in mixed IT environments where data is stored in public and private clouds as well as in an on-premises data center?
Look for a software-based encryption solution that performs just as well in a data center as it does in the cloud. Remember that regardless of where the data is stored, it’s important that the data owner, not the hosting provider, retain possession and management of the crypto keys. If your encryption solution doesn’t allow you to manage the keys, then look elsewhere.
Has the solution been tested and/or benchmarked on the applications running in your environment?
Most large organizations utilize a variety of database applications from the more traditional like MySQL and PostgreSQL to newer big data apps like Cassandra, MongoDB and HBase. To ensure your encryption utility functions cross-platform and meets your performance standards, ask your provider whether they’ve tested against the databases that are most important to you.
Does the solution use NIST-approved encryption algorithms?
The National Institute of Standards and Technology Computer Security Division publishes the security requirements for cryptographic modules in FIPS 140-2. If your vendor’s solution uses FIPS-validated crypto modules, you can feel confident in the strength of its cryptographic algorithms.
The bottom line is, you don't need to employ cryptographic experts to secure your data and meet HIPAA, FIPS, FERPA or PCI compliance initiatives. All it takes is a trusted security partner who can help you get where you need to go.
This week at the Gartner Security & Risk Management Summit, analyst Jay Heiser revealed his Top 10 IT Security Myths. While the article doesn’t address big data security specifically, you can easily extrapolate that these myths, if they continue to proliferate, will only do more damage as the volume, velocity, variety and sensitive nature of the data grows.
Check the myths out when you get a chance, and I’m sure you’ll nod in approval at some and raise an eyebrow at others. Following up on my encryption myths blog from a few weeks back, I’d like to address Gartner’s IT Security Myth number ten, which covers the same topic:
I agree that encryption is not a magic bullet for keeping sensitive files safe, but it can be your greatest asset if implemented properly. So while the premise for the myth is sound, I want to challenge the supposed “cure” for this myth (how do you cure a myth anyway?).
At Gazzang, we talk to customers each and every day about data encryption and key management. These are IT guys, security experts, engineers, database admins, and architects. All are incredibly knowledgeable about what they do, and they typically understand the basic concepts of encryption as well. But to expect a customer to have “solid experience in cryptography” is a little much.
A better approach is to partner with a company that has cryptographic and key management expertise AND understands how to implement this security in complex big data environments.
But in order to select the right encryption partner, you must first know what questions to ask. In part two of this blog, we’ll look at five questions that every encryption vendor must be able to answer. Stay tuned.
One of the best things about my job is being able to talk to customers and prospects about the Gazzang data security suite. I really enjoy hearing how people are using technology to meet their business goals. It’s interesting to think about how our solutions fit into the big picture, whether that means locking down a big data Hadoop cluster or helping a small application provider meet HIPAA requirements.
Of course the majority of feedback comes directly from existing customers: what’s working well for them and how we can optimize our solutions to meet their unique needs. For example, a few customers have expressed interest in using Gazzang zNcrypt to provide encrypted file systems for general-purpose work, without having to set up specific process-based ACLs.
Customers want to spin up a new cloud instance, create an encrypted directory /data/encrypted, load and store everything in this directory, and then throw whatever tools at it that they need to achieve the outcome. In this case they’re looking for the assurance of at-rest encryption without having to tweak the process access control list each time they point a new utility at the data.
We took that feedback and ran with it, and we’re pleased to announce this feature is now available in the current version of Gazzang zNcrypt. We call it wildcard support (*) in the ACL rules. Wildcard support delivers at-rest encryption for files, directories and blocks while providing a versatile range of configuration options to meet a broad array of needs.
Here’s how you might use it:
ALLOW @* * * # Allows ALL processes to access ALL categories
ALLOW @mysql * * # Allows ALL processes to access the @mysql category only
ALLOW @* * /usr/sbin/mysqld # Allows ONE process to access ALL categories
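One way to picture the wildcard semantics: each rule has a category, path and process field, and `*` (or `@*` for the category) matches anything in that position. Here is a minimal Python sketch of such a matcher, for illustration only, not the actual zNcrypt matching code:

```python
def rule_matches(rule, category, path, process):
    """Check whether an 'ALLOW @category path process' rule permits access.

    '*' matches any value in the path or process position; '@*' matches
    any category. Illustrative sketch only -- not zNcrypt source.
    """
    r_cat, r_path, r_proc = rule.split()[1:4]  # drop the leading ALLOW
    return ((r_cat in ("@*", category)) and
            (r_path in ("*", path)) and
            (r_proc in ("*", process)))

# The three example rules above, in order:
rule_matches("ALLOW @* * *", "@backup", "/data/f", "/usr/bin/tar")                  # True
rule_matches("ALLOW @mysql * *", "@backup", "/data/f", "/usr/bin/tar")              # False
rule_matches("ALLOW @* * /usr/sbin/mysqld", "@backup", "/data/f", "/usr/sbin/mysqld")  # True
```

The first rule is the fully open case customers asked for: any process can work in the encrypted directory, while the data still stays encrypted at rest.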
Give the new rules a try and let us know what you think.
Rarely a day goes by that you don't hear about a data breach. Hospital records stolen. Social media accounts hacked. Education transcripts revealed. Every industry is susceptible and every company is at risk. The result can be embarrassing and expensive at best and absolutely crippling at worst, with potential fines, time-consuming lawsuits, and subsequent loss of customer trust.
The steady pace of breaches reinforces the need for encryption as a last line of defense. Recently, however, one of the oldest and most effective security tactics has been largely relegated to an afterthought in today's new cloud and big data environments.
This is the result of some common misperceptions about encryption and key management related to cost, performance and ease of use.
Today we set the record straight, breaking down the nine biggest encryption myths.
Myth 1: Encryption is only for organizations that have compliance requirements.
Certainly any company in a regulated industry that mandates data security and privacy should encrypt. That's a no-brainer. But a better way to think about encryption is this: if you've got data about your products, customers, employees or market that you believe is sensitive or competitive, you should ALWAYS encrypt it, whether there's a legal obligation or not.
Myth 2: SSL encrypts data everywhere.
SSL only encrypts data in motion; it does not cover data at rest. As data is written to disk, whether it's stored for one minute or several years, it should be encrypted.
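The at-rest requirement can be illustrated in a few lines of Python. This is a toy: XOR with a random one-time pad stands in for a production cipher such as AES-GCM, and in practice the key would live on a separate key server, but the shape is the same - only ciphertext ever touches disk:

```python
import os
import secrets
import tempfile

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """XOR one-time pad; a toy stand-in for a real cipher like AES-GCM."""
    return bytes(p ^ k for p, k in zip(plaintext, key))

decrypt = encrypt  # XOR is its own inverse

record = b"patient_id=1234;status=admitted"
key = secrets.token_bytes(len(record))  # in practice: held on a key server

# Only ciphertext is ever written to disk.
path = os.path.join(tempfile.mkdtemp(), "record.enc")
with open(path, "wb") as f:
    f.write(encrypt(record, key))

with open(path, "rb") as f:
    assert decrypt(f.read(), key) == record  # readable only with the key
```

SSL would protect that record while it crossed the wire; this step protects it for the minute, or the several years, it sits on disk.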
Myth 3: Encryption is too complicated and requires too many resources.
Data encryption can be as complicated or as easy as you want to make it. The key is to understand the type of data that needs to be encrypted, where it lives and who should have access to it. There are plenty of readily available, easy to use and affordable encryption tools on the market. If application performance is important, look for a transparent data encryption solution that sits beneath the application layer and does not require modifications to your operating system, application, data or storage.
Myth 4: Encryption will kill database performance.
There are a number of factors that impact database performance, and encryption is just one. Application-level encryption tends to carry the greatest performance hit, while the file-level encryption penalty is much lower. For maximum application performance, run block-level encryption on a system that supports the Intel AES-NI instruction set, which accelerates encryption in hardware.
Myth 5: Encryption doesn't make the cloud more secure.
On the contrary, storing encrypted data in the cloud is often more secure than keeping it on premises, where insiders may have easier access. To ensure the safekeeping of encrypted data in the cloud, make sure you, not your cloud provider, maintain control of the encryption keys. If your provider requires you to hand over your keys, find another cloud service.
Myth 6: Encrypted data is secure data.
Too many organizations fail to effectively manage their encryption keys, either storing them on the same server as the encrypted data or allowing a cloud provider to manage them. Either practice is akin to locking your car and leaving the keys in the door. Good key management, with strong policy enforcement, makes all the difference.
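The "keys apart from data" rule is commonly implemented with envelope encryption. A toy Python sketch follows; XOR stands in for a real cipher (e.g. AES key wrap), and the key-encryption key is assumed to live on a separate key manager:

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    """Toy cipher stand-in; real systems use AES key wrap / AES-GCM."""
    return bytes(x ^ y for x, y in zip(a, b))

# Key manager host: holds the key-encryption key (KEK), never the data.
kek = secrets.token_bytes(32)

# Data server: makes a per-file data key (DEK), encrypts, wraps the DEK,
# then discards the plaintext DEK. Only ciphertext and the wrapped DEK
# sit next to each other on disk -- useless without the remote KEK.
dek = secrets.token_bytes(32)
ciphertext = xor(b"sensitive record padded to 32b!!", dek)
wrapped_dek = xor(dek, kek)
del dek

# Reading the file back: the key manager unwraps the DEK only after its
# access policy passes (policy check omitted in this sketch).
recovered_dek = xor(wrapped_dek, kek)
assert xor(ciphertext, recovered_dek) == b"sensitive record padded to 32b!!"
```

An attacker who steals the data server's disk gets ciphertext plus a wrapped key - the locked car without the keys in the door.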
Myth 7: Key management requires expensive, cloud-averse hardware.
While this was once true, today there are effective software-based solutions that enable organizations to deploy key management in the cloud or on premises. These solutions can typically be provisioned far faster than hardware security modules (HSMs), are very cloud friendly and meet most compliance statutes.
Myth 8: If your data is encrypted, it can't be stolen.
There is no security solution that will protect your data 100%. In fact, companies should operate with the mindset that their data can and likely will be compromised at some point in time. Data encryption can make the breach aftermath much more palatable though, since encrypted data cannot be decrypted without the key.
Myth 9: Encryption is old school. I need a newer security technology to protect big data.
Data encryption is a proven security technique that works very well in modern NoSQL environments. As big data projects move from pilot to production, sensitive data such as protected health information (PHI), financial records, and other forms of personally identifiable information (PII) will likely be captured, processed, analyzed and stored. Encryption is just as integral to securing data in NoSQL as it is in traditional relational database systems.
Firewalls and VPNs can provide some protection against data breaches and theft, but there is no substitute for strong encryption and effective key management, especially in big data and cloud environments. Now that the biggest myths have been busted, there's no longer an excuse not to encrypt.
I'd like to address a recent blog post in CloudTweaks titled, "Cloudera Not Cutting It With Big Data Security." The author makes a number of very salient and valid points about Hadoop security… or lack thereof.
Indeed the Apache Hadoop platform, which includes HDFS and MapReduce as well as related projects like HBase, Mahout and Hive, was not designed for security. The Hadoop name, for better or worse, is nearly synonymous with big data because it delivers the "three V's" (velocity, variety, volume) at massive scale, enabling organizations to crunch, process, analyze and retain data like never before.
Clearly there are security and compliance implications to big data. Consider the following:
The blog suggests that Cloudera, and I presume other commercial Hadoop vendors, should do more to address the security concerns in Hadoop.
I believe Cloudera absolutely has the right approach to security.
Cloudera has some of the brightest Hadoop and Apache minds in the world. They're experts in enterprise-class systems management. That's what they do. Addressing the big data needs of customers should always remain the company's primary focus.
Cloudera has also cultivated one of the most comprehensive partner ecosystems in the big data market. This is important because it enables Cloudera to focus on its core strengths, while leveraging outside expertise in analytics, BI, cloud computing, and of course, security.
Think about it this way: Would you expect the same company that built your house to install the alarm, mow your lawn and provide Internet service?
Certainly not, so why then would you demand security from a company that specializes in Hadoop?
The right approach is to look for a company with expertise and experience in securing Hadoop platforms.
There's a false narrative that traditional relational databases inherently offer cutting-edge security, but the truth is, security was (and remains) a responsibility of the end user. Encryption, authentication, policy enforcement and other security tools are all available for CDH and Hadoop, even if they're not provided directly by Cloudera. A customer can work with Cloudera to locate the right vendor for their particular challenge and work collaboratively to build an integrated, secure Hadoop platform.
Data security - particularly big data security - is quickly becoming a hot topic, as Hadoop-related projects migrate from pilot to production environments. We believe it's incumbent upon big data providers to develop an ecosystem of tightly integrated vendors to complete their security offerings. Cloudera has certainly done that.
Gazzang is proud to be a certified Cloudera partner and will continue to provide enterprise-class data security to Hadoop users.
At Gazzang, we have a mantra that borders on religious fanaticism.
“Customers First. Always.”
It’s the reason we can claim deep expertise in securing unique, enterprise-scale big data environments. It’s the reason we know cloud encryption better than anyone else. And it’s the reason no one on our customer support team owns a bed.
Customers also have a significant impact on our product development cycles. A perfect example is today’s exciting Gazzang CloudEncrypt™ announcement.
Gazzang CloudEncrypt was designed to meet specific customer use cases for securing sensitive data at every stage of the Amazon EMR process. This is a very different challenge from encrypting data on a persistent cloud platform like Amazon EC2, which can be done with readily available solutions like Gazzang zNcrypt and zTrustee.
CloudEncrypt offers encryption and key management in ephemeral, burstable Amazon EMR processes. The solution, which you can read about in great detail in this white paper, was developed at the request of a handful of Gazzang customers that had two very clear needs in common:
More detailed customer use cases are covered in the white paper, but the top three we’ve heard thus far are as follows:
Customer feedback is a part of everything we do at Gazzang. The ability to learn from and innovate in response to what we hear from the companies we serve is a badge of honor that we wear proudly.
As always, we welcome your feedback on Gazzang CloudEncrypt, your solution for securing sensitive datasets and outputs on Amazon EMR.
Gazzang is hitting the road again. This time, we're in Atlanta. Home to the 1996 Summer Olympics, The Varsity and this week, MongoDB Atlanta. The annual conference is hosted by 10gen, the company behind MongoDB and a key partner for Gazzang. 10gen is a global leader in big data with an impressive customer list that includes Disney, Intuit, foursquare and CERN.
In advance of MongoDB Atlanta, I spoke with Matt Asay, 10gen's vice president of Corporate Strategy about use cases for MongoDB and why the Gazzang relationship is important for 10gen customers:
Gazzang: What are you and 10gen most excited about this year?
Matt Asay: Over the past few years, NoSQL went from an industry curiosity to a driving force for two of the industry's most important trends: cloud and Big Data. Along the way, MongoDB has established itself as the industry's most popular NoSQL database, with broad adoption by a range of customers. Going into 2013, we're seeing early experiments with MongoDB turn into enterprise-wide deployments for some seriously mission-critical applications. It's awesome to see.
But it's also great to see open source becoming such an integral part of how the enterprise builds and uses software. I'm particularly excited to see open source at the forefront of innovation now, and in 2013 I think we're going to see projects like MongoDB, Hadoop, Android, and the various open-source cloud projects drive huge value for consumers and enterprises alike. It's an exciting time to be involved with open source.
Gazzang: What are some of the unique big data challenges that 10gen helps customers solve?
Matt: While MongoDB is often used to manage large volumes of data, most enterprises actually think of "Big Data" in terms of data velocity and variety, as a recent NewVantage survey highlights. Looking at the results, a mere 28% of enterprises today see volume of data as a primary driver for their Big Data projects, falling to 25% in three years. Instead, a whopping 64% are motivated by the need to analyze streaming data, analyze new data types, or analyze data from diverse sources. That number rises to 68% in three years.
With MongoDB, enterprises:
Some of these involve huge quantities of data, but Big Data's value isn't necessarily tied to volume. It's more about intelligently using one's data to engage customers or others in ways previously difficult or impossible.
Gazzang: Why do you think the health care industry has been quick to adopt MongoDB?
Matt: Few industries generate as much data as healthcare, which makes the ability to scale cost-effectively, something MongoDB handles well, especially important. We have also seen healthcare organizations keen to blend structured and unstructured data to improve care, and NoSQL databases like MongoDB are an excellent way to embrace a wide array of data sources. I also think MongoDB's document data store is a great fit for how healthcare organizations want to structure their data.
Gazzang: Why is data security important to your customers?
Matt: Many of our customers are in highly regulated industries like Financial Services and Healthcare. Security for these industries is not only a nice-to-have, it's a firm requirement. As important as it is to be able to scale one's databases, and to accept an array of data sources, it's critical for firms in such industries to ensure customer or other sensitive data is secure. And while we build strong security features into MongoDB itself, we're also very happy to work with security solutions like Gazzang to offer an even higher level of security.
Gazzang: Why is the Gazzang relationship important for 10gen?
Matt: As I mentioned, we take our customers' data security very seriously, and have built in advanced security functionality like Kerberos authentication to help security-conscious customers rest easy. But Gazzang helps us add an even richer layer of security to MongoDB, something especially important to customers in regulated industries.
Gazzang: What can attendees expect to see and learn at MongoDB ATL?
Matt: MongoDB ATL, like all MongoDB events, is very focused on enabling developers and IT operations to get productive with MongoDB. There are no vendor infomercials, from 10gen or any of our partners. We keep the agenda information-rich as our main concern is making sure more and more companies build exceptional applications with MongoDB.
Health care organizations are moving infrastructure and data to the cloud at a fairly rapid pace. A recent study suggests the cloud computing market in health care is expected to reach $5.4 billion by 2017. Enticing as the cloud is, when dealing with highly sensitive and regulated information, it's important to proceed with caution.
The good news for pharma companies, biotech firms and research hospitals - organizations most likely to move heavy big data payloads to the cloud - is that there are security best practices that can protect data at rest in the cloud. Check out the infographic below, or send us an email at firstname.lastname@example.org.
This week, a team of security researchers pulled a list of 126 billion files from public Amazon S3 buckets.
Within a subset of these buckets were – you guessed it – plain-text files, many of which contained sensitive information like sales records and employee data. This InfoWorld article does a good job explaining how simple it was to access this data and where the security breakdown occurred.
When it comes to securing data in the cloud, the customer ultimately needs to take responsibility. Network World featured a good dialogue on this very topic earlier in the week.
The cloud can be an incredibly safe place to store sensitive data and run business-critical applications; perhaps even safer than your own data center. But it’s up to the customer to make sure the right security controls are in place. Encrypting your data and maintaining control of your keys is the best place to start.
This security technique ensures your cloud provider or anyone running an unauthorized program or process cannot access the data. It’s also a necessary step toward enabling compliance.
If the aforementioned data in S3 were encrypted, even in a public bucket, the search results would have yielded nothing of value.
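That point can be sketched in a few lines of Python. The dict below stands in for an S3 bucket, and XOR with a random one-time pad stands in for a real cipher such as AES-GCM; the names are illustrative, not a real S3 API:

```python
import secrets

def encrypt(data: bytes, key: bytes) -> bytes:
    """XOR one-time pad; applying it again with the same key decrypts.
    Toy stand-in for client-side AES-GCM encryption before upload."""
    return bytes(d ^ k for d, k in zip(data, key))

sales_record = b"Q3 sales: 1,402 units"
key = secrets.token_bytes(len(sales_record))  # stays with the data owner

# Only ciphertext is ever "uploaded" to the (public) bucket.
bucket = {"reports/q3.txt": encrypt(sales_record, key)}

# A researcher pulling the public object gets opaque bytes; only the
# owner, holding the key, can recover the record.
recovered = encrypt(bucket["reports/q3.txt"], key)
```

Encrypt client-side, keep the keys yourself, and even a misconfigured public bucket leaks nothing of value.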