
Category Archives: Data Management

Sacramento ARMA Records Knowledge Conference Event Recap

Conferences, Cyber Security, Data Management, Government, Information Security, Information Technology, Innovation in the Public Sector, IT Modernization, IT Security, KAI Partners, Public Sector, Ransomware, Risk Assessment, Sacramento, Technology

By Jamal Hartenstein, JD, CISSP, CGEIT, PMP

The Greater Sacramento Capitol Chapter of ARMA recently held its annual Records Knowledge Conference, which brought together records managers from city, county, and state clerk offices.

According to our local ARMA chapter, ARMA is dedicated to providing education and resources to those in the Records Management and Information Governance fields. They are committed to enhancing Records Management and Information Governance professionals through training, networking, leadership, and outreach.

The conference attendees brought a sense of eagerness to learn and share. ARMA chapter leadership gave event attendees a special opportunity to hear from world-class speakers, including Dr. Ashish Kundu, a lead researcher on the IBM Watson project, on some of the most important and cutting-edge topics.

Along with a formidable group of CEOs, I was honored to be asked to speak about Cybersecurity Threats to Information Governance. Highlights of the event and major takeaways included:

  • Understanding what data you have, who accesses it, and where it goes is paramount.
  • Conflicts among document retention policies, industry best practices, and laws suggest that we seek out and use the highest common denominator.
  • Trending topics and buzzwords in the government sector include Smart Communities, Artificial Intelligence (AI), Digital ID, Blockchain, NIST, and the KAI Partners approach to security assessments.
  • Data Migrations are underway. Records Managers who respond to Freedom of Information Act (FOIA) requests for public records or to subpoenas must deliver records in formats that adhere to general business practices, which may be legacy formats.
  • Regarding Third Party Risk Management (TPRM), cloud services, and Business Associate Agreements, liability points back to the data controller regardless of contracts with data processors or third parties.
  • Mobile device management and data/device ownership remain a point of contention and confusion during public record requests.
  • Innovation is forcing a cultural shift in workforce demands and understandings of emerging technologies.
  • Artificial Intelligence (AI) solutions can be used to categorize and classify data, performing some of the tasks of current Data Custodians and Data Owners.
  • While AI may not replace Records Managers, Records Managers who understand and embrace AI will inevitably replace those who do not.

Public sector IT innovation and modernization means systems and processes change rapidly. One example of this is California Assembly Bill 2658, recently signed into law by the governor. This new law updates the definition of an Electronic Record to include blockchain and smart contracts as legally recognized records. It sends a clear signal that digital records management, particularly blockchain technology and smart contracts, is a priority for a more innovative and dynamic public sector.

This new law impacts public records requests because entries logged in public agency-owned private blockchains are electronic records. These records are subject to the Freedom of Information Act (FOIA). Records Managers may benefit from technology that makes the identification and delivery of public records to requestors easier. It may also create convenience for those exercising Public Records Act (PRA) requests. It’s a double-edged sword; it streamlines the process but increases PRA volume at the same time.

The California blockchain law was one of the most important topics discussed at the ARMA event. Another popular topic was IT Security Assessments.

The urgency around public sector data governance and records management presents an incredible opportunity to embed IT security controls for the public sector personnel working at the heart of these ever-expanding challenges.

KAI Partners performs security assessments to address the multitude of challenges facing the public sector. Our assessments help ensure secure and efficient delivery systems where the organizational objectives align with the development of strategic plans and programs. In addition, KAI Partners’ training division—KAIP Academy—works to address technical skills gaps. Our training courses include ITIL, Project Management, Agile/Scrum, and more.

Were you at the ARMA Conference? What were your biggest takeaways about public sector innovation?

About the Author: IT Security Program Manager at KAI Partners, Jamal Hartenstein is a cybersecurity legal expert who has helped some of the country’s largest financial institutions, healthcare companies, and federal agencies develop their IT Security Roadmap programs. In his current role, Jamal provides guidance to executive staff and security professionals on laws, frameworks, and policies that help shape their strategic plan, and helps organizations innovate safely and securely. Prior to working for KAI Partners, Jamal served as an Electronic Warfare Sergeant in the U.S. Army Military Intelligence Corps, where he was a steward of the Defense Information Systems Agency (DISA) framework. He earned his undergraduate degree from Georgia Military College and his Juris Doctor from the University of the Pacific, McGeorge School of Law in California.

Big Data and Hadoop: A high-level overview for the layperson

Big Data, Data Management, Information Technology, Internet of Things, Sacramento, Technology

By Sid Richardson, PMP, CSM

I have been in the data warehousing practice since 1994, when I implemented a successful Distributed Data Warehouse for a flagship banking product, followed by co-developing Oracle’s Data Warehouse Methodology. In August 1997, I was invited to speak at the Data Warehouse Institute Conference in Boston.

Over the years, I’ve researched and implemented what I would consider some small scale/junior Big Data systems. I have an interest in Big Data and wanted to share my learnings on Big Data and Hadoop as a high-level overview for the layperson / busy executive.

What is Big Data?

Big Data describes an IT approach used to process the enormous amounts of available information from social media, emails, log files, text, camera/video, sensors, website clickstreams, Radio Frequency Identification (RFID) tags, audio, and other sources of information in combination with existing computer files and database data.

In the 1990s, three major trends emerged that together make up Big Data: “Big” Transaction Data, “Big” Interaction Data, and “Big” Data Processing.

In 2001, Doug Laney, former Vice President and Distinguished Analyst with the Gartner Chief Data Officer (CDO) research and advisory team, defined Big Data by the “three Vs”:

    1. Velocity – Speed of incoming data feeds.
    2. Variety – Unstructured data, social media, documents, images.
    3. Volume – Large quantities of data.

IBM decided to add two more Vs:

    1. Veracity – Accuracy of the data.
    2. Value – The business value or insight that can be derived from the data.

Why do we need Big Data?

In a nutshell: We need Big Data because there is a lot of data to process. For example:

As noted by The Economist, the abundance of data and the tools to capture, process, and share all this information already exceeds the available storage space (and the number of eyes on the planet to review and analyze it all!).

According to Forbes’s 2018 article, “How Much Data Do We Create Every Day? The Mind-Blowing Stats Everyone Should Read,” there are 2.5 quintillion bytes of data created each day. And, over the last two years alone, 90 percent of the data in the world was generated.

Clearly, the creation of data is expanding at an astonishing pace—from the amount of data being produced to the way in which it’s re-structured for analysis and used. This trend presents enormous challenges, but it also presents incredible opportunities.

You’re probably thinking, alright, I get the big data thing, but why couldn’t data warehouses perform this role? Well, data warehouses are large, complex, and expensive projects that typically run 12-18 months and have high failure rates (the failure rate of data warehouses across all industries is high—Gartner once estimated that as many as 50 percent of data warehouse projects would have only limited acceptance or fail entirely).

A new approach to handle Big Data was born: Hadoop.

What is Hadoop?

In a nutshell, Hadoop is a Java-based framework governed by the Apache Software Foundation (ASF) that initially addressed the ‘Volume’ and ‘Variety’ aspects of Big Data and provided a distributed, fault-tolerant, batched data processing environment (one record at a time, but designed to scale to Petabyte-sized file processing).

Hadoop was created out of a need to substantially reduce the cost of storage for massive volumes of data for analysis and does so by emulating a distributed parallel processing environment by networking many cheap, existing commodity processors and storage together, rather than using dedicated hardware and storage solutions.
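
Hadoop’s core MapReduce model processes records in a map step and then aggregates the results in a reduce step. To make that idea concrete, here is a minimal word-count sketch using Hadoop Streaming, which lets the map and reduce steps be written in any language and run across files stored in the cluster. The script name, input/output paths, and streaming jar location in the comment are illustrative only, not from a specific installation:

```python
#!/usr/bin/env python3
"""wordcount.py: a minimal Hadoop Streaming job, run once as the mapper and once as the reducer."""
import sys

def mapper():
    # Emit "word<TAB>1" for every word on every input line.
    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

def reducer():
    # Hadoop sorts the mapper output by key, so all counts for a word arrive together.
    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")

if __name__ == "__main__":
    # Illustrative invocation (paths and jar location vary by installation):
    #   hadoop jar hadoop-streaming.jar -input /data/in -output /data/out \
    #     -mapper "python3 wordcount.py map" -reducer "python3 wordcount.py reduce"
    mapper() if sys.argv[1] == "map" else reducer()
```

Because the mapper runs independently on each block of the input files across many machines, and the framework handles sorting, shuffling, and restarting failed tasks, the same few lines of code scale from a quick local test (pipe a file through the script) to petabyte-sized inputs on a cluster.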

Why Hadoop?

The Challenges with Hadoop

There is limited understanding of Hadoop across the IT industry. Hadoop has operational limitations and performance challenges—you need to resort to several extended components to make it work and to make it reliable. And Hadoop is becoming more fragmented, pulled in different directions by commercial players trying to leverage their own solutions.

In summary…

The Hadoop Framework addresses a number of previous challenges facing the processing of Big Data for analysis. The explosion in deployment of data capture devices across all industries world-wide necessitated a more cost-effective way to store and access the massive volumes of data accumulating by the second!

I hope this blog post has provided you with a better understanding of some key Big Data and Hadoop concepts and technologies. Have you worked with Big Data and/or Hadoop? Let us know your thoughts and experiences in the comments!

P.S. If you have gotten this far and are curious where the name Hadoop comes from, here you go! The name ‘Hadoop’ was coined by one of the sons of Doug Cutting, a software designer and an advocate for and creator of open-source search technology. Mr. Cutting’s son had given the name ‘Hadoop’ to his toy elephant, and Mr. Cutting used the name for his open-source project because it was easy to pronounce.

About the Author: Mr. Richardson’s passion is Data Warehousing, Business Intelligence, Master Data Management, and Data Architectures. He has helped Fortune 500 companies in the US, Europe, Canada, and Australia lead large-scale corporate system and data initiatives and teams to success. His experience spans 30 years in the Information Technology space, specifically in data warehousing, business intelligence, information management, data migrations, converged infrastructures, and, recently, Big Data. Mr. Richardson’s industry experience includes the finance and banking, government, utilities, insurance, retail, manufacturing, telecommunications, healthcare, large-scale engineering, and transportation sectors.

KAI Partners Staff Profile: The Data Architect

Business Analysis, Data Architect, Data Management, Data Science, Government, IT Modernization, KAI Partners, KAI Partners Staff Profile, Learning, Project Management, Sacramento, Technology, Training

There are many paths to success and while not everyone takes the same path, we often manage to arrive at the same destination. In our KAI Partners Staff Profile series, we share interviews and insight from some of our own employees here at KAI Partners. Our staff brings a diversity of educational, professional, and life experience, all of which demonstrates that the traditional route is not necessarily the one that must be traveled in order to achieve success.

Today, we bring you the journey of Ajay Bhat, Senior Data Architect at KAI Partners, Inc., who works as an Enterprise Data Architect for one of KAI Partners’ public sector clients. His role involves managing different Data Management activities and architecting solutions to meet the client’s needs.

KAI Partners, Inc.: How did you get into your line of work?

Ajay Bhat: My first job as a GET (Graduate Engineer Trainee) was assisting with Business Process Reengineering and helping implement Enterprise Resource Planning (ERP). Though a Mechanical Engineer by background, I was introduced in that first job to various IT tools used for ERP implementation. Over a period of time, I got trained in different ERP software.

KAI: Are there any certifications or trainings you’ve gone through that have helped in your career?

AB: Staying current with technology is something that I have always liked. I have completed certifications in Oracle, Java, and SAS. I did some self-learning courses in Big Data technologies and Data Science. I also went back to school to get my MBA in Business Intelligence from the University of Colorado Denver.

KAI: What is your favorite part about your line of work and why?

AB: Problem solving is my favorite part of my job. When I go to work, there is always an issue to resolve that involves some aspect of critical thinking. Using technology to implement solutions is another thing I like about my job.

KAI: What is one of the most common questions you receive from clients and what counsel or advice do you give them?

AB: Depending on the project, the questions may vary, but most frequently I am asked how I am able to switch roles on a project so fast. One day I may be a database programmer, a DBA another day, and a data modeler, BI analyst, or Data Architect some other day. Switching between roles is what I do frequently. My answer to this is that any role is a series of small logical steps. It may seem quite overwhelming from a distance, but if we break it down into a series of logical steps, it is doable. This directly applies to any problem solving I do in my day-to-day life as well.

Now that we’ve learned more about Ajay’s data architecture work, here’s a little more about him!

Quick Q&A with Ajay:

Daily must-visit website(s):
https://github.com/
https://www.kdnuggets.com/datasets/index.html
https://www.kaggle.com/
https://slack.com/

Preferred genre of music or podcast to listen to: Classic jazz, Bollywood music

Best professional advice received: At the end of the day, it is just another day at work; do your best.

Book you can read over and over again: Autobiography of a Yogi by Paramahansa Yogananda

Most-recent binge-watched show: I don’t binge watch now, but did binge “24” a while ago

About Ajay: Ajay currently supports a public sector client doing Data Management. Besides work, he loves outdoor activities, racquetball, running, and a game of chess. He also practices meditation regularly.

What the KAI Partners Team is Thankful for in 2017

Communications, Data Management, Employee Engagement, General Life/Work, KAI Partners, Organizational Change Management (OCM), Project Management, Project Management Professional (PMP), Prosci, Sacramento, SAHRA—The Sacramento Area Human Resources Association, SHRM, Small Business, Team Building, Training


From the KAI Partners team to yours, we wish you a happy, healthy, and stress-free Thanksgiving holiday.

Planning for Test Data Preparation as a Best Practice

Best Practices, Data Management, Project Management, Systems Development Life Cycle (SDLC), Testing

By Paula Grose

After working in and managing testing efforts on and off for the past 18 years, I have identified a best practice that I use in my testing projects and recommend for other testing projects as well.

This best practice is test data preparation, which is the process of preparing the data to correlate to a particular test condition.

Oftentimes, preparing data for testing is a big effort that people underestimate and overlook. When you test the components of a new system, it’s not as simple as just identifying your test conditions and then executing the test—there are certain factors you should take into account as you prepare your test environment. This includes what existing processes, if any, are in place to allow for the identification or creation of test data that will match a test condition.

A test case may consist of multiple test conditions. For each test condition, you must determine all the test data needs (a simple sketch follows this list). This includes:

  • Input data
  • Reference data
  • Data needed from other systems to ensure synchronization between systems
  • Data needed to ensure each test will achieve its expected result
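
To make this concrete, here is a minimal sketch of how the data needs for each test condition might be catalogued so nothing is overlooked. The field names and the example condition are hypothetical, not drawn from any particular project:

```python
from dataclasses import dataclass, field

@dataclass
class TestDataNeeds:
    """Catalogue of the data a single test condition depends on."""
    condition: str                                       # the behavior being verified
    input_data: dict = field(default_factory=dict)       # records fed into the system under test
    reference_data: dict = field(default_factory=dict)   # lookup/code tables the condition relies on
    external_data: dict = field(default_factory=dict)    # data other systems must hold for synchronization
    expected_result: str = ""                            # what must happen for the test to pass

# Hypothetical example: one condition from an eligibility test case.
needs = TestDataNeeds(
    condition="Applicant over the income limit is denied",
    input_data={"applicant_id": 1001, "monthly_income": 5200},
    reference_data={"income_limit": 4000},
    external_data={"payroll_system": "wage record on file for applicant 1001"},
    expected_result="Application status is set to DENIED",
)
```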

Planning for test data preparation can greatly reduce the time required to prepare the data. At the overall planning stage for testing, there are many assessments that should be conducted, including:

  • Type of testing that will be required
  • What testing tools are already available
  • Which testing tools may need to be acquired

If, at this point, there are no existing processes that allow for easy selection and manipulation of data, you should seek to put those processes in place. Most organizations have a data guru who is capable of putting processes in place for this effort—or at least can assist with the development of these processes.

The goal is to provide a mechanism that will allow the selection of data based on defined criteria. After you do this, you can perform an evaluation as to whether the existing data meets the need—or identify any changes that must be made. If changes are required, the process must facilitate these changes and provide for the loading/reloading of data once changes are made.

One word of caution concerning changing existing data: You must be certain that the existing data is not set up for another purpose. Otherwise, you may be stepping on someone else’s test condition and causing their tests to fail. If you don’t know for sure, it is always better to make a copy of the data before any changes are made.
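
As a simple illustration of such a selection mechanism, and of copying data before it is changed, here is a short sketch. The record layout, selection criteria, and status values are hypothetical:

```python
import copy

def select_test_data(records, criteria):
    """Return the records whose fields match every key/value pair in criteria."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

# Hypothetical existing data already loaded in the test environment.
records = [
    {"case_id": 1, "status": "open", "region": "north"},
    {"case_id": 2, "status": "closed", "region": "north"},
    {"case_id": 3, "status": "open", "region": "south"},
]

# Select data matching the defined criteria for this test condition.
candidates = select_test_data(records, {"status": "open", "region": "north"})

# Copy before changing anything, in case the data is set up for someone else's test.
backup = copy.deepcopy(candidates)
for record in candidates:
    record["status"] = "pending_review"  # the change this test condition requires
```

The same idea applies whether the data lives in flat files or database tables; the point is a repeatable way to find, evaluate, and safely alter candidate test data.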

About the Author: Paula Grose worked for the State of California for 33 years, beginning her work in IT as a Data Processing Technician and, over time, performing all aspects of the Systems Development Life Cycle. She started by executing a nightly production process and progressed from there. As a consultant, Paula has performed IV&V and IPOC duties focusing on business processes, testing, interfaces, and data conversion. She currently leads the Data Management Team for one of KAI Partners’ government sector clients. In her spare time, she is an avid golfer and enjoys spending time with friends and playing cards and games.
