16-01-2023
Any organisation that wants to go digital must have a structure in place that supports its needs. Only then will it be able to achieve its objectives for sales, profits, customers, orders and so on. This may sound like the epitome of innovation and digital transformation, but in practice, how do you get started?
These are the questions we usually hear when we start a tech project from scratch. In most cases, we resort to Java. More than a programming language (one of the most used of all time, and one that many people first meet at college), it is a technology with different areas of application. In this article, we introduce Spring for web development and Apache Spark as a solution for processing large volumes of data.
Which framework should you use to develop Java applications? The answer is Spring!
Due to its various functionalities and components, Spring has already been the subject of one of Xtech Community’s blog posts. The technology community widely regards it as the best framework for developing Java applications, for several reasons:
- Popularity: it is used by over 50% of the large Java developer community and has open-source licensing.
- Reliability: commercial support is available (important for companies and consultants); it has existed since 2004 and has received several updates since then.
- Productivity: it is practical, effective, simple, modular and well documented; it is also deployable, testable and portable, which means it can run on platforms such as the JBoss application server or in Docker containers. All you need is the executable to start and use the application.
We introduced Spring as a framework with several components and these are the 3 most used:
- Spring Boot: for developing Java applications in a very easy and fast way
- Spring Data: for accessing different types of data repositories (such as SQL, NoSQL, REST, etc.) in a unified way
- Spring Cloud: At a time when the various structures with an overload of applications need to move to modern architectures (such as microservices), this Spring component allows the various tools to interact with each other
Microservices architectures are the most widely used in web development today. Previously, most architectures were monolithic, i.e. based on a single application, which led to conflicts when moving to production. A microservices architecture, by contrast, needs to be scalable, available, fast and autonomous.
In other words, Spring makes it easy to create several microservices so that the various tools of an application work and interact.
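To make the idea of a small, autonomous service more concrete, here is a minimal sketch of a single-purpose service with one endpoint. It deliberately does not use Spring: the JDK's built-in `HttpServer` stands in so the example runs without any dependencies, and in a real project Spring Boot would take care of this plumbing. The class name `OrderStatusService`, the `/orders/` path, the port and the hard-coded status are all hypothetical and chosen only for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical single-purpose service: it only answers order-status queries.
class OrderStatusService {

    // Business logic kept separate from HTTP so it can be tested on its own.
    // The status is hard-coded and the id is not escaped: illustration only.
    static String statusJson(String orderId) {
        return "{\"orderId\":\"" + orderId + "\",\"status\":\"SHIPPED\"}";
    }

    // Starts the service on the given port (0 lets the OS pick a free one).
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/orders/", exchange -> {
            // Everything after "/orders/" is treated as the order id.
            String path = exchange.getRequestURI().getPath();
            String orderId = path.substring("/orders/".length());
            byte[] body = statusJson(orderId).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Because the service owns one narrow responsibility and exposes it over HTTP, it can be deployed, scaled and replaced independently of the other services in the application, which is the property the microservices approach relies on.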
What is the solution for managing large volumes of data? Big Data
There are several organisations that, due to their success or industry, will collect large amounts of data and information that needs to be processed. This is one of the functions of Big Data. More than a technological unit, it is “a buzzword that contemplates several concepts”:
- Volume: if there are large volumes of data, they must be quantified and processed.
- Variety: data comes from various sources, such as databases, file systems or legacy systems.
- Velocity: there must be the ability to compute data quickly and in a timely manner.
- Veracity: knowledge of data types and which to compute.
- Value: question what data to compute to bring value and purpose to customers.
For example, Facebook generated 4 petabytes of data every day and Google processed over 20 petabytes per day (2020 data). It is only natural, then, that Big Data pushes organisations to implement robust and scalable systems to handle this huge amount of data. But how? Three frameworks address these challenges:
- Hadoop: widely used in data processing, it is a framework that enables distributed computing of information. It relies on a distributed file system (HDFS).
- Apache Spark: similar to Hadoop, with one major difference: it computes data in memory before storing it on disk. It calls for clusters of machines with plenty of RAM.
- Apache Kafka: enables event streaming, making it possible to process data in near real time.
All these Big Data frameworks scale horizontally: you add more machines to run new processes and parallelise the computing. In practice, they add robustness to an architecture, allowing it to support a heavy load of data sources coming from various applications.
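The idea of splitting work into independent pieces and combining the results can be sketched on a single machine. The example below is only an analogy, not the Hadoop or Spark API: it uses plain Java parallel streams to count words across CPU cores, whereas the frameworks above distribute the same kind of map-and-reduce work across many machines. The class name `ParallelWordCount` and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Single-JVM analogy of a map-reduce word count (not the Spark or Hadoop API).
class ParallelWordCount {

    static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                // "Map" step: each line is split into lower-cased words independently.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                // Drop empty tokens produced by leading whitespace.
                .filter(word -> !word.isEmpty())
                // "Reduce" step: partial counts from all threads are merged per word.
                .collect(Collectors.groupingByConcurrent(
                        Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "big data needs big clusters",
                "Spark computes data in memory");
        // Iteration order of the resulting map is not guaranteed.
        count(lines).forEach((word, n) -> System.out.println(word + " = " + n));
    }
}
```

Since each line is processed independently, adding more workers (cores here, machines in a cluster) increases throughput without changing the logic, which is what makes horizontal scaling possible.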