Blog Feed

Translation between Spring Boot and FastAPI

Task | Spring Boot | FastAPI
Generate project structure | https://start.spring.io/ | https://fastapi.tiangolo.com/project-generation/
Define API routes | @RequestMapping | APIRouter
Deal with database (ORM & querying) | Hibernate | SQLAlchemy
Entity to Data Transfer Object (DTO) | ModelMapper | Pydantic.BaseModel (with orm_mode = True)
Database migration | Liquibase, Flyway | Alembic
Dependency management | Maven / Gradle | pip
HTTP request interceptor | AspectJ, ServletFilter | Depends, Middleware
Built-in authentication | Spring Security | FastAPI.security
Read environment file | @Value("${key}") with .properties files | Pydantic.BaseSettings with a .env file
OpenAPI | springdoc-openapi | Built-in module; access it at localhost:8000/docs
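To make the routing row concrete, here is a minimal Spring MVC endpoint with its FastAPI counterpart sketched in the comments (class names, paths and the UserDTO type are invented for the example):

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/users")          // FastAPI: router = APIRouter(prefix="/users")
public class UserController {

    // FastAPI: @router.get("/{user_id}") returning a Pydantic model
    @GetMapping("/{id}")
    public UserDTO getUser(@PathVariable Long id) {
        return new UserDTO(id, "sample-user");
    }
}

class UserDTO {                    // FastAPI: class UserDTO(BaseModel): ...
    public final Long id;
    public final String name;
    UserDTO(Long id, String name) { this.id = id; this.name = name; }
}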

A thought about language

There are certainly a lot of programming languages on the market. Each language creates its own job positions and even a programming “religion” in which engineers prefer a certain language over others. Thanks to that, the software development market can thrive, because the diversity of languages increases the scarcity of the workforce.

So why do some languages offer higher salaries than others?

Well, a language is a tool, and each tool is more appropriate for a certain purpose. Salary is compensation from the budget assigned to a purpose, so there is actually no standard market price attached to each language. It is still the supply-demand game. The difference is that the demand is influenced by the big players in the industry. They create programming languages and frameworks that overcome the limitations of the ones that existed before. The limitation can be speed, memory efficiency, friendly syntax, or a built-in solution for repeatable tasks. As a result, they can save money on renting physical resources (computers) and save working hours (construction and bug fixing), and those savings also contribute to the compensation, i.e. the salary.

Can an engineer learn multiple languages?

Absolutely Yes.

Learning a language is always hard and time-consuming. There is a lot to remember: syntax. And syntax can be translated from one language to another by the same principle as human languages. So far, for each language, what we need to remember is:

  • How to declare variables
  • How to write conditional checks (if-else)
  • How to write loops (for, while)
  • How to separate code into reusable blocks (class, method, function, interface)
  • How to handle concurrency (process, thread, event loop)
  • Which libraries are offered out of the box by each language.

The tactic is: for the first language, learn it carefully and practice a lot. For each new language, compare it to the one we already know, find the translation between the syntax sets, then practice a lot.

Another tip is to search for open-source projects and read their code. This is also the quickest way to learn from top engineers.

Is it worth knowing more than one language?

Yes, I think so.

If we only need a stable job, one language is enough. But remember that technology changes every day, and every project has a start date and an end date. So, ensure your adaptability.

If we are interested in technology, learning new languages gives us a deeper understanding of computers, as well as more than one way to solve the same issues. It keeps us open-minded and curious and provides a broad range of knowledge, which is true stability, as far as I have learned from many professionals.

And the whole programming activity is a career, not a single language itself.

Is it tiring to know more than one language?

Yes, obviously. No need to explain.

First time with NLP, huh?

Natural Language Processing (NLP) is a major research field of AI, and to almost all developers it sounds like a miracle. Lately I have had an interest in this field since the notable viral news about the GPT-3 model. I decided to learn to use it as a tool before it somehow replaces developer jobs in the future, as many illustrious figures predict. But the more I studied it, the less I knew: there is too much background knowledge required before understanding each word of the GPT-3 paper. Below is a quick summary of the work behind the scenes that will hopefully be useful to developers like me who want to make a leap to catch up with AI progress.

List of keywords

It is an inevitably long and exhausting journey to build a fairly basic understanding of the terms below:

  • Convolutional Neural Network, Recurrent Neural Network, Activation Function, Loss Function, Back Propagation, Feed Forward.
  • Word Embedding, Contextual Word Embedding, Positional Encoding.
  • Long Short-Term Memory (LSTM).
  • Attention Mechanism.
  • Encoder-Decoder Architecture.
  • Language Model.
  • Transformer Architecture.
  • Pre-trained Model, Masked Language Modeling, Next Sentence Prediction.
  • Zero-shot learning, One-shot learning, Few-shot learning.
  • Knowledge Graph.
  • BERT, GPT, BART, T5

What exists before BERT and GPT ?

A lot of research and work already existed in the NLP field. Working in NLP means solving the common tasks below:

  • Part-of-Speech Tagging.
  • Named Entity Recognition.
  • Sentiment Classification.
  • Question & Answering.
  • Text Generation.
  • Machine Translation.
  • Summarization.
  • Similarity Matching.

spaCy and NLTK are the two most famous libraries in the NLP field; they provide tools, frameworks and models solving a few of the tasks above, but not everything. Each task usually had its own model, and there was no reuse or transfer between models, until the Transformer architecture was published. With its amazing performance and ability, researchers began to think about using this architecture to perform the NLP tasks above, to have one single model that can do it all. The results are the BERT and GPT models, both of which use the Transformer. In fact, BERT powers the Google search engine, and GPT-3 powers the ChatGPT application. More applications making use of these models can be found around the Internet.

Some Core Challenges when doing NLP

No matter what method is applied, the challenges that form the NLP field are still the same:

  • Computers do not understand words; they understand numbers. Find a method to convert each word in a sentence into a vector (a group of numbers) such that, given 2 words with similar meanings, the 2 vectors have a close distance representing the similarity (a toy similarity computation is sketched right after this list).
  • Given a sentence with many words and variable length, find a vector that can represent the sentence.
  • Given a passage with many sentences and variable length, find a vector that can represent the whole passage.
  • From the vector of a word, sentence or passage, find a method to convert it back into words/sentences/passages. This task in turn becomes Machine Translation or Text Summarization.
  • From the vector of a word, sentence or passage, find a method to classify it into senses/intents. This task in turn becomes Sentiment Classification.
  • From the vector of a word, sentence or passage, find a method to calculate its similarity to another vector. This task in turn becomes Question & Answering, Text Generation, or Text Suggestion.
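Below is a toy illustration of the first challenge: cosine similarity between hand-made 3-dimensional "embeddings" (the vectors and values are invented for the example; real embeddings have hundreds of dimensions and come from a trained model):

public class CosineSimilarity {

    // Cosine similarity: values near 1.0 = same direction (similar meaning
    // under the embedding), values near 0 = unrelated.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] king  = {0.8, 0.3, 0.1};  // toy vectors, not real embeddings
        double[] queen = {0.7, 0.4, 0.1};
        double[] car   = {0.1, 0.2, 0.9};
        System.out.println(cosine(king, queen)); // close to 1.0
        System.out.println(cosine(king, car));   // much smaller
    }
}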

It would take too long to dive into each keyword here, so please hit the Follow button to receive upcoming posts from my learning journey.

Thanks for reading!

Why is 3 a magic number?

I know this feels subjective, but from my observation of programming life and daily life, the number 3 appears everywhere.

  • [Java, OOP] Best practice says that the depth of class inheritance should be 2 or 3, but not more than that.
  • [Javascript, callback] Callback hell refers to the situation where callbacks are nested within other callbacks several levels deep. From a depth of 3 levels, the source code becomes potentially difficult to understand.
  • [UI UX] The 3-Click Rule states that users should be able to perform a task in 3 clicks. That's a good goal for a UX designer to achieve.
  • [System design] Most systems use a 3-layer or 3-tier design, such as MVC or MVP.
  • [3D design] 3D models are formed from triangles.
  • [Writing] Power phrases are formed from 3 consecutive nouns, verbs, or adjectives
  • [Time] The past, the present, the future
  • [Religion] Faith and Hope and Charity
  • [Mind] The heart, the brain, the body
  • [Government] Separation of Powers is the concept whereby power must be divided into 3: the legislative, executive and judicial branches
  • [Dating] It usually takes three dates for you to know if this person is good enough for you to keep them in your life
  • [Survival] You can survive three minutes without breathable air (unconsciousness), or in icy water. You can survive three hours in a harsh environment (extreme heat or cold). You can survive three days without drinkable water. You can survive three weeks without food.
  • [Lucky] “Third time lucky”.
  • [Competition] A stable competitive market never has more than three significant competitors.
  • [Photograph] A composition guideline: place your subject in the left or right third of an image, leaving the other two thirds more open. This creates compelling photos.
  • [Music] The idea is to have only three musical phrases playing at any one time in your song. Going beyond three or four elements at once can crowd your track, making it harder for your audience to connect and recall your composition.
  • [Music] Each chord is formed from 3 notes.
  • [Music] Basic rock/pop song structure generally has three unique parts: The verse, the chorus, and the bridge.
  • [Golden Circle] Why, How, What
  • [Teaching/Learning] Give students the opportunity to learn something at least three times before they are expected to know it and apply it.

And there is a rule with this very name: the “Rule of Three”.

Docker cheat sheet

Get a Bash shell in a container
docker exec -it <container name> /bin/bash

Clear untagged (dangling) images
docker rmi $(docker images --filter "dangling=true" -q --no-trunc)

Get a container's IP
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $INSTANCE_ID

Dump a Postgres database to a .sql file (-U postgres: database username)
docker exec -t your-db-container pg_dumpall -c -U postgres > dump_`date +%d-%m-%Y"_"%H_%M_%S`.sql

Import a .sql file into a Postgres database
cat your_dump.sql | docker exec -i your-db-container psql -U postgres

Copy a file/folder from local to a container
docker cp foo.txt container_id:/foo.txt
docker cp src/. container_id:/target

Copy a file/folder from a container to local
docker cp container_id:/foo.txt foo.txt
docker cp container_id:/src/. target

Docker somehow lost access to the Internet
sudo service docker restart
systemctl restart docker

Java 8 Stream Cheatsheet

Convert a List to a Map

// Example: we have the class
class Employment {
    String name;
    public String getName() { return name; }
}

// Usage
List<Employment> employmentList = ...;
Map<String, Employment> employments = employmentList.stream()
        .collect(Collectors.toMap(Employment::getName, Function.identity()));

Merge Lists without duplicates

// Example: we have the class
class CustomerLabel {
    Long id;
    public Long getId() { return id; }
}

// To merge 2 lists without duplicated ids
List<CustomerLabel> listA = ...;
List<CustomerLabel> listB = ...;
List<CustomerLabel> merged = new ArrayList<>(
        Stream.of(listA, listB)
                .flatMap(List::stream)
                .collect(Collectors.toMap(CustomerLabel::getId,
                        d -> d,
                        (x, y) -> x == null ? y : x))
                .values());

Sum of a List

// Example: we have the class
class Sale {
    Double totalDollar;
    public Double getTotalDollar() { return totalDollar; }
}

// Usage
List<Sale> sales = ...;
Double sum = sales.stream().map(Sale::getTotalDollar).reduce(0d, Double::sum);

Calculate Average value

Double average = sales.stream()
    .mapToDouble(Sale::getTotalDollar)
    .average()
    .orElse(Double.NaN);

Find Max Number in a List

List<Double> numbers = ...
Double max = numbers.stream().max(Double::compareTo).orElse(null);

Find Min Number in a List

List<Double> numbers = ...;
Double min = numbers.stream().min(Double::compareTo).orElse(null);

Sort a List of Objects

By default, the sort is ascending.

List<Sale> sales = ...;
List<Sale> sortedList = sales.stream()
        .sorted(Comparator.comparingDouble(Sale::getTotalDollar))
        .collect(Collectors.toList());

Find an element in a List

List<Sale> sales = ...

// Find the first Sale that has totalDollar > 100
Sale firstMatch = sales.stream().filter(sale -> sale.getTotalDollar() > 100).findFirst().orElse(null);

// Find any Sale that has totalDollar > 100
Sale anyMatch = sales.stream().filter(sale -> sale.getTotalDollar() > 100).findAny().orElse(null);

Difference between findFirst() and findAny()

findAny | findFirst
Returns any element satisfying the filter; in single-threaded mode this is usually the first element | Always returns the first element satisfying the filter
In parallel mode, it is not guaranteed to return the first element | In parallel mode, it still ensures the first element in the list is returned
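A small demo of the difference (toy data; assume java.util.Arrays and java.util.List are imported):

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

// Sequential stream: both calls return 6 here
Integer first = numbers.stream().filter(n -> n > 5).findFirst().orElse(null); // always 6
Integer any   = numbers.stream().filter(n -> n > 5).findAny().orElse(null);   // usually 6

// Parallel stream: findFirst still returns 6 (encounter order),
// while findAny may return 6, 7 or 8 depending on thread timing
Integer parallelAny = numbers.parallelStream().filter(n -> n > 5).findAny().orElse(null);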

Parallel mode

Replace .stream() with .parallelStream()

List<Integer> listOfNumbers = Arrays.asList(1, 2, 3, 4);
int sum = listOfNumbers.parallelStream().reduce(0, Integer::sum); // e.g. a parallel sum

Note:

  • The number of threads in the common pool is equal to the number of processor cores.
  • To change the thread pool size, add a flag when starting the application:
    -Djava.util.concurrent.ForkJoinPool.common.parallelism=4
  • Sometimes the overhead of managing threads, sources and results is more expensive than doing the actual work.
  • Arrays split cheaply and evenly, while LinkedList has none of these properties.
  • TreeMap and HashSet split better than LinkedList, but not as well as arrays.
  • The merge operation is really cheap for some operations, such as reduction and addition.
  • Merge operations like grouping to sets or maps can be quite expensive.
  • As the number of computations increases, the data size required to get a performance boost from parallelism decreases.
  • Parallel streams cannot be considered a magical performance booster; sequential streams should still be used as the default during development.

Regex Cheatsheet

(All patterns below use Java regex syntax.)

Valid phone format

With parentheses:
^((\(\d{3}\))|\d{3})[- .]?\d{3}[- .]?\d{4}$
Example: (988) 989-8899

With international prefix:
^(\+\d{1,3}( )?)?((\(\d{3}\))|\d{3})[- .]?\d{3}[- .]?\d{4}$
Example: +111 (202) 555-0125

10 digits:
^(\d{3}[- .]?){2}\d{4}$
Example: 989 999 6789

Valid email format

Simplest:
^(.+)@(.+)$

RFC 5322:
^[a-zA-Z0-9_!#$%&'*+\/=?`{|}~^.-]+@[a-zA-Z0-9.-]+$

No leading, trailing, or consecutive dots:
^[a-zA-Z0-9_!#$%&'*+\/=?`{|}~^-]+(?:\.[a-zA-Z0-9_!#$%&'*+\/=?`{|}~^-]+)*@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$

Valid money format

^(?:(?![,0-9]{14})\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?|(?![.0-9]{14})\d{1,3}(?:\.\d{3})*(?:\,\d{1,2})?)$
Examples: 123,234,432.43 and 123.234.432,43

Valid ISO date-time format

(\d{4}-\d{2}-\d{2})[A-Z]+(\d{2}:\d{2}:\d{2}).([0-9+-:]+)
Example: 2021-10-12T23:59:00+07:00

Valid URL

^(https?|ftp|file):\/\/[-a-zA-Z0-9+&@#\/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#\/%=~_|]
Example: https://www.the-tech-lead.com

Strong password

^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{8,}$
- A digit must occur at least once
- A lowercase letter must occur at least once
- An uppercase letter must occur at least once
- A special character must occur at least once
- No whitespace allowed in the entire string
- At least eight characters

Valid IPv4

^(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.(?!$)|$)){4}$
Example: 192.168.0.1

Valid IPv6

Standard format:
^(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$
Example: 2001:0db8:85a3:0000:0000:8a2e:0370:7334

Compressed format:
^((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)::((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)$
Example: 2001:0db8:85a3:0000::8a2e:0370:7334

Valid domain name

^(?!-)[A-Za-z0-9-]+([\-\.]{1}[a-z0-9]+)*\.[A-Za-z]{2,6}$
Example: the-tech-lead.com

Valid credit card number (Visa, MasterCard, American Express, Diners Club, Discover, and JCB)

^(?:4[0-9]{12}(?:[0-9]{3})?|[25][1-7][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$

Camel Case

((?:[A-Z]+[a-z]*\s)*[A-Z]+[a-z]*)
Example: TX Simply Premier Checking

Frequently used regular expressions

Note:

  • When using String regex = "copied regex", don't forget to add an extra "\": "\d" becomes "\\d", "\." becomes "\\.", etc.
  • Replace ^ and $ with .* to get a "contains" match. For example, to match a string containing text that looks like money:
    boolean match = someString.matches(".*(?:(?![,0-9]{14})\\d{1,3}(?:,\\d{3})*(?:\\.\\d{1,2})?|(?![.0-9]{14})\\d{1,3}(?:\\.\\d{3})*(?:\\,\\d{1,2})?).*");
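For completeness, here is a minimal sketch of using one of the patterns above from Java with java.util.regex (the phone regex from the table; input strings are made up):

import java.util.regex.Pattern;

public class PhoneCheck {

    // Compile once and reuse; note the doubled backslashes in the Java string
    private static final Pattern PHONE =
            Pattern.compile("^((\\(\\d{3}\\))|\\d{3})[- .]?\\d{3}[- .]?\\d{4}$");

    public static void main(String[] args) {
        System.out.println(PHONE.matcher("(988) 989-8899").matches()); // true
        System.out.println(PHONE.matcher("988-98-8899").matches());    // false
    }
}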

Server Setup Cheatsheet

Users & Groups

Create User with password
useradd -m <username>
passwd <username>

Create a Group
groupadd <group name>

Add user to group
usermod -aG <group name> <username>

Remove user from group
deluser <username> <group name>

Set ACLs to grant a user access to a folder
setfacl -m u:<username>:rwx,d:u:<username>:r <folder path>

SSH

Connecting
ssh <username>@<host IP or domain>
ssh -i <path to id_rsa file> <username>@<host IP or domain>

Generate SSH key
ssh-keygen

Add SSH public key to remote server
Manually paste public keys to: ~/.ssh/authorized_keys
Or: ssh-copy-id <username>@<ssh_host>
Note: before running ssh-copy-id, the user must already exist on the remote server. ssh-copy-id will prompt for that user's password to log in.

Download files/folder via SSH
scp [-r] <username>@<remote server>:<path on remote server> <path on local>

Upload files via SSH
scp [-r] <path on local> <username>@<remote server>:<path on remote server>

Configure SSH timeout

vi /etc/ssh/sshd_config

# Hit "i" for INSERT mode in vi, then edit the line below
# (value in seconds; sshd_config does not allow inline comments)
ClientAliveInterval 1200

# Hit Esc to leave INSERT mode, type ":x" to save the file
# Reload sshd
sudo systemctl reload sshd

Firewall

List all Rules of all Chains:
iptables -n -L -v --line-numbers

List all Rules of a specific Chain
iptables -L INPUT --line-numbers

Delete a Rule in a Chain at a line number
iptables -D INPUT 10

Allow Incoming Traffic, inserting a Rule at a specific line
iptables -I INPUT <line_number> -p tcp --dport 80 -s <source_ip> -j ACCEPT

Allow Outgoing Traffic, appending a Rule at the end of a Chain
(--sport needs a protocol, so -p is required here)
iptables -A OUTPUT -p tcp -d <destination_ip> --sport <source_port> -j ACCEPT

[NAT] Allow LAN nodes to access public network via interface eth0
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

[NAT] Redirect Incoming traffic to internal node
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to 172.31.0.23:80

Parameters:
-p : tcp | udp | icmp | all
-j : ACCEPT | DROP | QUEUE | RETURN

Run a script when startup

sudo vim /etc/rc.local

Edit the rc.local file with your desired commands, like below:

#!/bin/sh
# add your commands here
# last line must be exit 0 
exit 0

Then activate it by:

sudo chmod -v +x /etc/rc.local
sudo systemctl enable rc-local.service

Monit

Let the server notify you when something goes wrong!

Origin: https://mmonit.com/monit/documentation/monit.html

Install
apt-get install monit -y

Start as a daemon, polling once per n seconds
monit -d 30

Configuration file
~/.monitrc or /etc/monitrc

Specify configuration file :
monit -c <path to cf file>

Configuration file sample content

Open Httpd for Dashboard

set httpd port 2812 allow username:password
# with IP
set httpd
     port 2812
     use address 127.0.0.1
     allow username:password
# using htpasswd file with limited username
set httpd port 2812
      allow md5 /etc/httpd/htpasswd john paul ringo george

Configure Daemon

SET DAEMON <seconds>

Setup Alert methods via Email

set alert dev@yourcompany.com
set mail-format {
      from: Monit Support <monit@foo.bar>
  reply-to: support@domain.com
   subject: $SERVICE $EVENT at $DATE
   message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
            Yours sincerely,
            monit
 }
SET MAILSERVER
        <hostname|ip-address>
        [PORT number]
        [USERNAME string] [PASSWORD string]
        [using SSL [with options {...}]
        [CERTIFICATE CHECKSUM [MD5|SHA1] <hash>],

Setup Alert via Slack Webhook

  • Go to https://<yourteam>.slack.com/apps/manage/custom-integrations
  • Click Incoming WebHooks
  • Click Add Configuration
  • Select an existing channel or create a new one (e.g. #monit) – you can change it later
  • Click Add Incoming WebHooks integration
  • Copy the Webhook URL
  • Create the file slack_notification.sh: touch /etc/slack_notification.sh, then make it executable with chmod +x /etc/slack_notification.sh.
  • Sample script :
#!/bin/bash
SERVER_NAME=$1
SERVICE=$2
STATUS=$3
TEST_WHEN="`date +%Y-%m-%d_%H:%M:%S`"
PRETEXT="$SERVER_NAME | $TEST_WHEN"
TEXT="$SERVICE - $STATUS"
SLACK_HOOK="Paste your Copied Slack web hook here"

curl -X POST --data-urlencode "payload={\"text\": \"$TEXT\"}" $SLACK_HOOK
  • Use in Configuration file
check program check-mysql ...
     if status != 0 then exec "/etc/slack_notification.sh ServerName ServiceName FAIL"

Monitor ports with Alert via Slack Hook

check host ServerA with address localhost
 if failed port 5433 protocol pgsql with timeout 30 seconds
  then exec "/etc/slack_notification.sh ServerA Postgres FAIL"
   else if succeeded then exec "/etc/slack_notification.sh ServerA Postgres OK"

Monitor process with Alert via Email

check process mysqld with pidfile /var/run/mysqld.pid
   if failed port 3306 protocol mysql then alert

Check remote host alive

check host Hostname with address www.yourremotehost.com
       if failed ping then alert

Check Disk amount usage

 check filesystem rootfs with path /
       if space usage > 90% then alert

Check Inode usage

 check filesystem rootfs with path /
       if inode usage > 90% then alert

Check CPU, Memory usage

check system $HOST
    if loadavg (5min) > 3 then alert
    if loadavg (15min) > 1 then alert
    if memory usage > 80% for 4 cycles then alert
    if swap usage > 20% for 4 cycles then alert
    # Test the user part of CPU usage 
    if cpu usage (user) > 80% for 2 cycles then alert
    # Test the system part of CPU usage 
    if cpu usage (system) > 20% for 2 cycles then alert
    # Test the i/o wait part of CPU usage 
    if cpu usage (wait) > 80% for 2 cycles then alert
    # Test CPU usage including user, system and wait. Note that 
    # multi-core systems can generate 100% per core
    # so total CPU usage can be more than 100%
    if cpu usage > 200% for 4 cycles then alert

Spring Annotation Cheatsheet

To create APIs

Annotation | Function | Vendor
@Controller, @RestController | Define a controller (in MVC) | Spring MVC
@RequestMapping, @GetMapping, @PostMapping, @PutMapping, @DeleteMapping | Define APIs | Spring MVC
@PathVariable, @RequestParam, @RequestBody | Bind to parameters on the API's URL or body | Spring MVC
@Service | Define a Spring component to be @Autowired in a controller | Spring MVC
@Repository | Define a Spring component interacting with the database, usually @Autowired inside a @Service or @Controller | Spring MVC
@Bean | Define a custom Spring bean, usually to be @Autowired into other components, or to override built-in beans provided by libraries | Spring
@Configuration | Define a Spring bean that stores configuration for particular libraries or for the application | Spring
@Autowired | Reference other beans without manually writing constructors | Spring
@ControllerAdvice & @ExceptionHandler | Create custom exception handlers | Spring
@Value("${server.port}") Integer port; | Read values from application.properties | Spring
Spring Web cheatsheet
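Since the table only names @ControllerAdvice & @ExceptionHandler, here is a minimal sketch of a global exception handler (NotFoundException and the messages are invented for the example):

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;

@ControllerAdvice
public class GlobalExceptionHandler {

    // Translates a domain exception into a 404 response
    @ExceptionHandler(NotFoundException.class)
    public ResponseEntity<String> handleNotFound(NotFoundException ex) {
        return ResponseEntity.status(HttpStatus.NOT_FOUND).body(ex.getMessage());
    }

    // Fallback for anything unexpected
    @ExceptionHandler(Exception.class)
    public ResponseEntity<String> handleOther(Exception ex) {
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body("Unexpected error");
    }
}

class NotFoundException extends RuntimeException {
    NotFoundException(String message) { super(message); }
}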

To create database schema

Annotation & sample usage | Function | Vendor
@Entity class Company { ... } | Create a table with name company | JPA/Hibernate
@Entity(name="my_user") class User { ... } ("user" is a reserved keyword on Postgres) | Create a table with name my_user | JPA/Hibernate
@Id @GeneratedValue Long companyId; | Define a primary key with an auto-generated id | JPA/Hibernate
@GeneratedValue | Create a field whose value is auto-generated using a shared sequence table | JPA/Hibernate
@GeneratedValue(strategy = GenerationType.IDENTITY) | Create a field whose value is auto-generated using a sequence per [entity + column] | JPA/Hibernate
@Column(columnDefinition="TEXT") String companyDescription; | Define the column data type | JPA/Hibernate
@Column(columnDefinition="boolean default true") Boolean active; | Define the column data type and a default value | JPA/Hibernate
@Column(name="custom_column_name") | Specify the column name | JPA/Hibernate
@Enumerated(EnumType.STRING) | Tell Hibernate to store the enum value as a String instead of its ordinal | JPA/Hibernate
@ManyToOne Company employeeCompany; | Define a foreign key referencing table company | JPA/Hibernate
@OneToOne UserProfile sampleUserProfile; | Foreign key field with a 1:1 constraint | JPA/Hibernate
@OneToMany List<Address> sampleListAddress; | A join table with 2 columns (company_id & address_id) will be created | JPA/Hibernate
@OneToMany(fetch = FetchType.LAZY), @ManyToOne(fetch = FetchType.LAZY), ... | Apply the lazy-loading technique to foreign key fields | JPA/Hibernate
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS) class ParentClass { ... } / class ChildClass extends ParentClass { ... } | Separate tables are created for each child class and the parent class | JPA/Hibernate
@Inheritance(strategy = InheritanceType.SINGLE_TABLE) class ParentClass { ... } | A single table contains all columns of the parent & child classes | JPA/Hibernate
@Inheritance(strategy = InheritanceType.JOINED) class ParentClass { ... } | Each class has its own table, and querying a subclass entity requires joining the tables | JPA/Hibernate
@ElementCollection(targetClass=String.class) List<String> sampleStringListField; | Store a List-type field; a collection table is created | JPA/Hibernate
@PrePersist public void beforeCreated() { ... } | Method executed before a new entity is created | JPA/Hibernate
@PreUpdate public void beforeUpdated() { ... } | Method executed before an existing entity is updated | JPA/Hibernate
@PreRemove public void beforeDeleted() { ... } | Method executed before an entity is removed | JPA/Hibernate
@PostPersist public void afterCreated() { ... } | Method executed after a new entity is created | JPA/Hibernate
@PostUpdate public void afterUpdated() { ... } | Method executed after an existing entity is updated | JPA/Hibernate
@PostRemove public void afterDeleted() { ... } | Method executed after an entity is removed | JPA/Hibernate
JPA/Hibernate Cheatsheet

To create Queries

@Repository
public interface UserRepository extends JpaRepository<User, Long> {

    // Derived query: Spring Data generates the query from the method name
    Optional<List<User>> findAllByActiveIsAndUsernameIs(Boolean active, String username);

    // Custom JPQL query
    @Query("select u from User u where lower(concat(u.firstName, ' ', u.lastName)) like lower(concat(:q, '%')) or lower(concat(u.firstName, ' ', u.middleName, ' ', u.lastName)) like lower(concat(:q, '%'))")
    Optional<List<User>> search(@Param("q") String q);

    // Join tables (assuming House has a 'user' association)
    @Query("select u from User u inner join House h on h.user.id = u.id where lower(h.address) like lower(concat(:q, '%'))")
    Optional<List<User>> searchByAddress(@Param("q") String q);
}

POJO

Annotation & sample usage | Function | Vendor
@NoArgsConstructor class UserDTO { ... } | Generate a no-argument constructor | Lombok
@AllArgsConstructor class UserDTO { ... } | Generate a constructor with all arguments | Lombok
@Data class UserDTO { ... } | Generate getter & setter methods | Lombok
@Builder class UserDTO { ... } | Generate a builder | Lombok
@SuperBuilder class UserDTO { ... } | Generate a builder, used for inheritance cases | Lombok
@Mapper interface UserMapper { UserMapper INSTANCE = Mappers.getMapper(UserMapper.class); } | Define a mapper object | Mapstruct
@Mapping(target="field_a", source="field_b", qualifiedByName="formatterName") UserDTO fromUser(User user); / @Mapping(target="password", ignore=true) User fromDTO(UserDTO dto); | Configure the mapping of methods inside a @Mapper | Mapstruct
POJO code generating

Read custom configurations from application.properties

@Configuration
@ConfigurationProperties(prefix="setting")
public class SettingProperties {
 String prop; 
 Long value;
 Boolean enabled;
 // ... standard getters & setters
}
=====
@Configuration
@EnableConfigurationProperties(SettingProperties.class)
public class SettingConfig {
 SettingProperties settingProperties;
 public SettingConfig(SettingProperties props) {...}
}
====
# application.properties
setting.prop=Some text
setting.value=3
setting.enabled=true

Async methods

@Service
public class SomeService {
    @Autowired
    SomeRepository someRepository;

    // Requires @EnableAsync on a @Configuration class
    @Async
    public void someAsyncMethod(String param) {
        someRepository.findAll();
        // ...
    }
}

// To invoke:
someService.someAsyncMethod("...");

Cron job

@Configuration
@EnableScheduling // enable cron job support
public class CronJobConfig {
    ...
}
@Scheduled(fixedDelay = 1000)
public void scheduleFixedDelayTask() {
    System.out.println(
      "Fixed delay task - " + System.currentTimeMillis() / 1000);
}
@EnableAsync
public class ScheduledFixedRateExample {
    @Async
    @Scheduled(fixedRate = 1000)
    public void scheduleFixedRateTaskAsync() throws InterruptedException {
        System.out.println(
          "Fixed rate task async - " + System.currentTimeMillis() / 1000);
        Thread.sleep(2000);
    }

}
@Scheduled(cron = "0 15 10 15 * ?")
public void scheduleTaskUsingCronExpression() {

    long now = System.currentTimeMillis() / 1000;
    System.out.println(
      "schedule tasks using cron jobs - " + now);
}

Microservices Tradeoffs and Design Patterns

Leaving aside the reasons why we should and should not jump into Microservices (see the previous post), here we talk more about the tradeoffs of Microservices and the design patterns that were born to deal with them.

Building Microservices is not as easy as installing some packages into your current system. Actually, you will install a lot of things :). The beauty of Microservices lies in the separation of services, which enables each module to be developed independently and keeps each module simple. But that separation is also the cause of new problems.

More I/O operations?

The first issue that is easy to recognize is the emergence of I/O calls between the separated services. It looks exactly like integrating our system with 3rd-party services, except that this time all those 3rd-party services are just our internal ones. To keep the API calls correct, there will be effort spent documenting and synchronizing knowledge between the teams handling the different services.

But here is the bigger problem: if every service has to keep a list of the other services' addresses (to call their APIs), they become tightly coupled, i.e. strongly dependent on each other, and that destroys the promised scalability of Microservices. This is when the Event-Driven style comes to the rescue.

Event Driven Design Pattern

Example tools: RabbitMQ, ActiveMQ, Apache Kafka, Apache Pulsar, and more.

The main idea of this pattern is that services do not need to know each other's addresses. Each service just needs to know an event pipe, or message broker, and entrusts it with distributing its messages and feeding back data from other services. There are no direct API calls between services. Each service only fires events into the pipe and listens for events coming from the pipe (a toy sketch appears at the end of this section).

Along with this design pattern, the mindset on how to store data needs some escalation too. We will not only store the STATE of entities, but also the stream of EVENTs that constructs that STATE. This storage strategy is also very effective when dealing with concurrent modifications of the same entity, which can cause inconsistent data. There are 2 approaches to storing and consuming events, the Queue and the Log, which we will discover in later topics.
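To make the decoupling concrete, here is a toy in-process event bus; a real system would use a broker such as Kafka or RabbitMQ, and the event names and payloads here are invented:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EventBus {

    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();

    // A service registers interest in an event type
    public void subscribe(String eventType, Consumer<String> handler) {
        listeners.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }

    // A service fires an event without knowing who consumes it
    public void publish(String eventType, String payload) {
        listeners.getOrDefault(eventType, Collections.emptyList())
                 .forEach(handler -> handler.accept(payload));
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        // The billing service knows only the pipe, not the order service's address
        bus.subscribe("OrderCreated", payload -> System.out.println("Billing saw: " + payload));
        bus.publish("OrderCreated", "{\"orderId\": 1}");
    }
}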

More Complex Query Mechanism?

Obviously, there will be moments when we need to query data that requires cooperation between multiple services. In the past, with the monolithic style, when the data of all services lived in the same database, writing an SQL query was simple. In the Microservices style, it isn't: each service secures its own database, as a recommended practice. We suddenly can't JOIN tables; we lose the out-of-the-box rollback mechanism of the database's Transaction feature when something goes wrong while storing data; and we may see longer delays while each service waits for data from other services. Those obstacles turn Event-Driven into a “must have” design for a Microservices system, since that design is the foundation for the patterns that solve this querying issue, most commonly Event Sourcing, CQRS, and Saga.

Event Sourcing

The terms Event-Driven and Event Sourcing can be a bit confusing. Event-Driven is about the communication mechanism between services, while Event Sourcing is a coding solution inside each service for retrieving the state of an entity: instead of fetching the entity from the database, we reconstruct it from an event stream. The event stream can be stored in many ways: in a database table, read from Event-Driven components such as Apache Kafka or RabbitMQ, or kept in a dedicated event stream database like EventStore. This method brings a new responsibility to developers: they will have to create and maintain the reconstruction algorithm for each type of entity (a minimal sketch follows).
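A minimal sketch of the reconstruction idea, with an invented account entity and event types:

import java.util.Arrays;
import java.util.List;

public class AccountProjection {

    static class Event {
        final String type;   // "DEPOSITED" or "WITHDRAWN"
        final long amount;
        Event(String type, long amount) { this.type = type; this.amount = amount; }
    }

    // Rebuild the current balance by replaying the event stream,
    // instead of reading a stored state
    static long replay(List<Event> stream) {
        long balance = 0;
        for (Event e : stream) {
            switch (e.type) {
                case "DEPOSITED": balance += e.amount; break;
                case "WITHDRAWN": balance -= e.amount; break;
            }
        }
        return balance;
    }

    public static void main(String[] args) {
        List<Event> stream = Arrays.asList(
                new Event("DEPOSITED", 100),
                new Event("WITHDRAWN", 30),
                new Event("DEPOSITED", 5));
        System.out.println(replay(stream)); // 75
    }
}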

As mentioned in the previous section, this strategy is helpful when dealing with concurrent data modification scenarios: think of the collaboration features seen in Google Docs or Google Sheets, or simply of 2 users hitting “Save” on the same form at very close moments. But this reconstruction approach is not so friendly to the more complex queries that are natural in a traditional database like Oracle or PostgreSQL, the SELECT * WHERE ones. So, to cover this drawback, each service usually also maintains a traditional database that stores the states of entities and serves the queries. This combination forms a new pattern called CQRS (Command and Query Responsibility Segregation), where reads and writes of an entity happen on different databases.

CQRS (Command and Query Responsibility Segregation)

As mentioned above, this pattern separates read and update operations into different data stores. A service can use the Event Sourcing technique to update an entity, or construct an in-memory database such as H2 to quickly store updates on entities, while persisting the calculated states of entities back to, say, an SQL database as quickly as possible. This pattern prevents data conflicts when many updates to a single entity arrive at the same time, while keeping a flexible interface for querying data.

This pattern is effective for scaling, since we can scale the read database and the write database independently, and it fits high-load scenarios: write requests can complete more quickly because there are fewer calls to a database with its potential delays from internal locking mechanisms. A quicker response means more room for other requests, especially in thread-based server technology such as Servlets or Spring.

A drawback of this pattern is coding complexity. With more components joining the process, there are more problems to handle. So it is not recommended in cases where the domain or business logic is simple; simple features fit nicely with the traditional CRUD method, and overusing anything is not good. I also want to repeat that if the whole system has no special needs regarding load, or no write-heavy features, it is not recommended to switch to Microservices at all (the reason is here).

Saga

Saga means a long heroic story, and the story about Transactions inside Microservices is truly heroic and long. A Transaction is an important database feature that aims to maintain data consistency; it prevents partial failure when updating entities. With distributed services, we have distributed Transactions. Now the mission is to coordinate those separated Transactions to regain the attributes of a single Transaction, ACID (atomicity, consistency, isolation, durability), across distributed services. Put simply: Saga is a design pattern that aims to form the Transaction for Microservices.

The Saga pattern is about what the system must do if there is a failure inside a service: it should somehow reverse some previously successful operations to maintain data consistency. The simplest way is to send out messages asking other services to roll back certain updates (a toy compensation flow is sketched below). To build a Saga, developers may have to anticipate many scenarios in which an operation can fail. More advanced solutions for the rollback mechanism implement techniques like semantic locks or entity versioning, which we can discuss in other topics. The point here is that it also adds a lot of complexity to the source code. The recommendation is to divide services well to avoid writing too many Sagas, and if some services are tightly coupled, to think about merging them back into one monolithic service; Saga is less suitable for tightly coupled transactions.
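Here is a toy orchestration-style compensation flow (service calls are stubbed with print statements; a real Saga would send messages to the other services):

import java.util.ArrayDeque;
import java.util.Deque;

public class OrderSaga {

    public static void main(String[] args) {
        // Each completed step pushes a compensating action,
        // executed in reverse order if a later step fails
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            reserveStock();
            compensations.push(() -> System.out.println("Compensate: release stock"));

            chargePayment(); // suppose this step fails
            compensations.push(() -> System.out.println("Compensate: refund payment"));

            System.out.println("Saga completed");
        } catch (RuntimeException e) {
            System.out.println("Step failed: " + e.getMessage());
            while (!compensations.isEmpty()) {
                compensations.pop().run(); // roll back the previous steps
            }
        }
    }

    static void reserveStock()  { System.out.println("Stock reserved"); }
    static void chargePayment() { throw new RuntimeException("payment declined"); }
}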

More Deployment Effort?

Back in the monolithic realm, deployment meant running a few command lines to build an API instance and a client-side application. Going with Microservices, we obviously have more than 1 instance, and we need to deploy each instance, one by one.

To reduce this effort, we can use CI/CD tools such as Jenkins, or one of the available cloud-based CI/CD services out there. We can also write our own tools; it won't be difficult. But there are still more issues than just running command lines.

Log Aggregation

Logging is a vital practice when building any kind of application: it provides a picture of how the system is doing and helps troubleshoot issues. Checking logs on separate services is not very convenient in Microservices, so it is recommended to stream logs to one center. There are many tools dedicated to this purpose nowadays, such as Graylog or Logstash. The most famous stack for collecting, parsing and visualizing logs at the moment is ELK, the combination of Elasticsearch + Logstash + Kibana. The drawback of these logging technologies is that they require quite a lot of RAM and CPU, mostly to support searching the logs. For small projects, preparing a machine strong enough may not be very affordable: Logstash alone needs about 1-2 GB of RAM, Graylog requires Elasticsearch and thus about 8 GB of RAM and a 4-core CPU, and the full ELK stack needs even more.

Health Check & Auto restart

Besides logging, we must also have a way to keep track of service availability. Each service can expose its own /healthcheck API that a tool periodically calls to check whether the service is alive. Or we can use proactive monitoring tools such as Monit or Supervisord to monitor ports/processes and configure their behavior when errors occur, such as sending emails or notifications to a Slack channel.

Besides health checks, each service should be able to restart automatically when something takes it down. We can configure a process to start whenever the machine is up by adding scripts to /etc/init.d or /etc/systemd on most Linux servers. For processes, we can use Docker to automatically bring services up right after they go down. For the machine itself: on a physical machine, we should enter the BIOS and enable auto-restart when power is on; on cloud machines, there is nothing to worry about.

These techniques are recommended not only for Microservices but for any monolithic system as well, to ensure availability.

Circuit Breaker

This one is for when bad things happen and we have no way to deal with them but to accept them; there are always such situations in life. For some reason, one or many services go down or become very slow due to network issues, making users wait a long time for just a button click. Most users are impatient and will likely retry the pending action, a lot, and the system can get even worse. This is when a Circuit Breaker takes action. Its role is similar to an electrical circuit breaker: to prevent catastrophic cascading failure across the system. The circuit breaker pattern allows you to build a fault-tolerant and resilient system that can survive gracefully when key services are either unavailable or have high latency.

The Circuit Breaker is placed between the clients and the actual servers containing the services. A Circuit Breaker has 2 main states, Closed and Open, with the following rules (a toy sketch follows the list):

  • In the Closed state, the Circuit Breaker just forwards requests from clients to the services behind it.
  • Once the Circuit Breaker discovers a failed request or high latency, it changes its state to Open.
  • In the Open state, the Circuit Breaker returns errors to clients' requests immediately, so the user acknowledges the failure; this is better than letting users wait, and it also reduces the load on the system.
  • Periodically, the Circuit Breaker makes retry calls to the services behind it to check their availability. If they are healthy again, it changes back to Closed; if not, it remains Open.
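A toy sketch of those rules (real implementations such as Hystrix also add a Half-Open state; the timings and messages here are invented):

import java.util.function.Supplier;

public class CircuitBreaker {

    private enum State { CLOSED, OPEN }

    private State state = State.CLOSED;
    private long openedAt = 0;
    private final long retryAfterMs;

    public CircuitBreaker(long retryAfterMs) { this.retryAfterMs = retryAfterMs; }

    public String call(Supplier<String> service) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < retryAfterMs) {
                return "FAIL FAST"; // answer immediately instead of letting the user wait
            }
            state = State.CLOSED;   // time to retry the real service
        }
        try {
            return service.get();
        } catch (RuntimeException e) {
            state = State.OPEN;     // trip the breaker on failure
            openedAt = System.currentTimeMillis();
            return "FAIL FAST";
        }
    }
}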

Luckily, we may not have to implement this pattern ourselves. There are tools out there such as Hystrix, a part of Netflix OSS, or Istio, the community one.

Service Discovery

As mentioned in the Event-Driven section, services inside a Microservices system do not need to know each other's addresses when they use an event channel. But what if the team is not familiar with the event style and decides not to use it, or the services are simple enough to just expose REST APIs only? Using Event-Driven is not a must-do, so in this case, how do we solve the addressing problem between services?

When the system needs to be scaled, more instances of one or many services will be added, removed, or just moved around. To let every service know the addresses (IP, port) of the others, we need a man in the middle that keeps records of the services' addresses and keeps them up to date. This module is called Service Discovery and is usually used along with load balancing modules. We may discuss this more in other topics.

We also do not need to create this component from scratch. There are tools out there such as etcd, Consul, and Apache ZooKeeper. Give them a try.

Ending

The above is an overview of what we need to know when moving to Microservices. Make sure you google them all before really starting. Each of these patterns has its pros and cons, plus mitigation strategies that other topics will cover. Thanks for reading!!