Apache Storm interview questions

11. When is the cleanup method called in Apache Storm?
The cleanup method is called when a bolt is being shut down, and it should release any resources that were opened. There’s no guarantee that this method will be called on a cluster: for example, if the machine the task is running on fails, there’s no way to invoke the method. The cleanup method is intended for running topologies in local mode (where a Storm cluster is simulated in-process), so that you can run and kill many topologies without leaking resources.
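The pattern can be sketched in plain Java. This is only an illustration: SimpleBolt, PooledBolt and runAndKill are hypothetical stand-ins written so the example runs without Storm on the classpath, and they are not Storm's real classes. In a real topology you would typically extend BaseRichBolt and override its cleanup() method.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CleanupSketch {
    /** Hypothetical stand-in for a bolt's lifecycle; NOT Storm's real interface. */
    public interface SimpleBolt {
        void prepare();
        void execute(String tuple);
        void cleanup();
    }

    /** Opens a thread pool in prepare() and releases it in cleanup(). */
    public static class PooledBolt implements SimpleBolt {
        private ExecutorService pool;

        @Override
        public void prepare() {
            // Resource opened when the bolt starts.
            pool = Executors.newFixedThreadPool(2);
        }

        @Override
        public void execute(String tuple) {
            pool.submit(() -> System.out.println("processed " + tuple));
        }

        @Override
        public void cleanup() {
            // Without this call the pool's threads would outlive the bolt.
            pool.shutdown();
        }

        public boolean isReleased() {
            return pool != null && pool.isShutdown();
        }
    }

    /** Roughly what happens in local mode: run a bolt, then kill it. */
    public static PooledBolt runAndKill(String... tuples) {
        PooledBolt bolt = new PooledBolt();
        bolt.prepare();
        try {
            for (String tuple : tuples) {
                bolt.execute(tuple);
            }
        } finally {
            bolt.cleanup();
        }
        return bolt;
    }

    public static void main(String[] args) {
        PooledBolt bolt = runAndKill("a", "b");
        System.out.println("released: " + bolt.isReleased());
    }
}
```

Because cleanup may never run on a real cluster, anything that must survive a crash (flushed writes, committed offsets) should not depend on it.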

12. How to set up SSL for Apache Storm?
To enable HTTPS for the Storm UI, set the following options in storm.yaml. Generating keystores with the proper keys and certificates is the user’s responsibility and must be done before this step.
1. ui.https.port
2. ui.https.keystore.type (example “jks”)
3. ui.https.keystore.path (example “/etc/ssl/storm_keystore.jks”)
4. ui.https.keystore.password (keystore password)
5. ui.https.key.password (private key password)
Optional config:
6. ui.https.truststore.path (example “/etc/ssl/storm_truststore.jks”)
7. ui.https.truststore.password (truststore password)
8. ui.https.truststore.type (example “jks”)
For 2-way authentication:
9. ui.https.want.client.auth (if set to true, the server requests client certificate authentication but keeps the connection if none is provided)
10. ui.https.need.client.auth (if set to true, the server requires the client to provide authentication)
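Put together, the settings above might look like the following storm.yaml fragment. Every value here is a placeholder: the port 8443 and the "changeit" passwords are assumptions for illustration, and the paths are the example paths from the list.

```yaml
# storm.yaml (sketch; all values are placeholders)
ui.https.port: 8443
ui.https.keystore.type: "jks"
ui.https.keystore.path: "/etc/ssl/storm_keystore.jks"
ui.https.keystore.password: "changeit"
ui.https.key.password: "changeit"

# Optional truststore
ui.https.truststore.path: "/etc/ssl/storm_truststore.jks"
ui.https.truststore.password: "changeit"
ui.https.truststore.type: "jks"

# 2-way authentication: require a client certificate
ui.https.need.client.auth: true
```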

13. Apache Kafka vs Apache Storm
a. Data Security
i. Apache Kafka
Kafka does not guarantee zero data loss, although the loss rate is typically very low. For example, Netflix reportedly saw about 0.01% data loss across 7 million message transactions per day.
ii. Apache Storm
In comparison, Storm guarantees that every tuple is processed at least once, provided the topology uses its acking mechanism.
b. Data Storage
i. Apache Kafka
Apache Kafka stores its data on the local filesystem, such as EXT4 or XFS.
ii. Apache Storm
Storm, on the other hand, is just a data processing framework. It does not store data; it only moves data from an input stream to an output stream.
c. Real-time messaging system
i. Apache Kafka
Kafka stores incoming messages before they are processed.
ii. Apache Storm
Storm, by contrast, processes messages in real time as they arrive.
d. Processing/ Transforming
i. Apache Kafka
We use Apache Kafka to ingest and buffer real-time data for processing.
ii. Apache Storm
We use Storm to transform that data.
e. Data Source
i. Apache Kafka
Kafka receives data from the actual data source.
ii. Apache Storm
Storm, on the other hand, typically reads its data from Kafka for further processing.
f. Basic Task
i. Apache Kafka
We use Kafka to transfer real-time application data from a source application to another application.
ii. Apache Storm
We use Storm for aggregation and computation.
g. Zookeeper Dependency
i. Apache Kafka
Kafka has traditionally required Apache ZooKeeper; recent releases can also run without it in KRaft mode.
ii. Apache Storm
Storm also depends on ZooKeeper, which coordinates Nimbus and the Supervisors.
h. Fault-Tolerant
i. Apache Kafka
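Kafka achieves fault tolerance by replicating partitions across brokers, with ZooKeeper coordinating the cluster.
ii. Apache Storm
Storm restarts failed workers automatically, and its Nimbus and Supervisor daemons are designed to fail fast and be restarted without losing state.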
i. Inventor
i. Apache Kafka
Kafka was originally developed at LinkedIn.
ii. Apache Storm
Storm was created by Nathan Marz at BackType and was open-sourced after Twitter acquired the company.
j. Language Support
i. Apache Kafka
Kafka clients exist for many languages, but Kafka works best with Java.
ii. Apache Storm
Storm can be used with any programming language.

14. Does the Apache Storm UI support a REST API?
The Storm UI daemon provides a REST API that allows you to interact with a Storm cluster, which includes retrieving metrics data and configuration information as well as management operations such as starting or stopping topologies.
The API base URL would thus be:
http://<ui-host>:<ui-port>/api/v1/…
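As a minimal sketch, the Java program below (Java 11 or later) builds a URL under that base path and requests the cluster summary endpoint. The class name StormUiClient and its helper methods are hypothetical, and the host localhost is an assumption; 8080 is the default ui.port, so adjust both to match your cluster.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StormUiClient {
    /** Builds the URL for an endpoint under the /api/v1/ base path. */
    public static URI endpoint(String uiHost, int uiPort, String path) {
        return URI.create("http://" + uiHost + ":" + uiPort + "/api/v1/" + path);
    }

    /** Sends a GET request and returns the response body (JSON) as text. */
    public static String get(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Host and port are assumptions; pass your own as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;

        URI uri = endpoint(host, port, "cluster/summary");
        System.out.println("GET " + uri);

        // Only contact the cluster when a host was passed explicitly.
        if (args.length > 0) {
            System.out.println(get(uri));
        }
    }
}
```

Running it with a host argument, for example `java StormUiClient.java my-ui-host 8080`, prints the JSON returned by the UI daemon.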

15. What happens when a worker dies in Apache Storm?
When a worker dies, the supervisor will restart it. If it continuously fails on startup and is unable to heartbeat to Nimbus, Nimbus will reschedule the worker.

Author: user
