What is Apache Storm? What are the components of Storm?

Apache Storm is an open source, distributed, real-time computation system used for real-time big data analytics. Unlike Hadoop, which does batch processing, Apache Storm does real-time processing, and it can be used with any programming language.

The components of Apache Storm include:

Nimbus: It works like Hadoop’s JobTracker. It distributes code across the cluster, uploads computations for execution, allocates workers across the cluster, monitors computations, and reallocates workers as needed.

ZooKeeper: It is used as a mediator for communication within the Storm cluster.

Supervisor: It interacts with Nimbus through ZooKeeper and, depending on the signals received from Nimbus, starts or stops worker processes.

Does Apache act as a Proxy server?

Yes. The Apache HTTP Server can also act as a proxy by using the mod_proxy module.

Why is Apache Storm the first choice for real-time processing?

-Easy to operate: Operating Storm is quite easy

-Really fast: A benchmark clocked it at over a million tuples processed per second per node

-Fault tolerant: It detects faults automatically and restarts failed workers

-Reliable: It guarantees that each unit of data will be processed at least once or exactly once

-Scalable: It runs across a cluster of machines

What is MultiViews?

A MultiViews search is enabled by the MultiViews Options. If the server receives a request for /some/dir/foo and /some/dir/foo does not exist, then the server reads the directory looking for all files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client’s requirements, and returns that document.

Explain how data flows as a stream in Apache Storm.

In Apache Storm, the flow of a data stream involves three components: Spout, Bolt and Tuple

-Spout: A spout is a source of data in Storm

-Bolt: A bolt processes the data it receives

-Tuple: Data is passed between spouts and bolts as tuples
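
The spout, bolt and tuple roles above can be sketched in plain Python. This is only an illustration of the data flow, not the Storm API: the function names `sentence_spout`, `split_bolt` and `count_bolt` and the sample sentences are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def sentence_spout() -> Iterator[Tuple[str]]:
    """Spout: the source of the stream. Emits one-field tuples."""
    for sentence in ["storm processes streams", "streams of tuples"]:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Dict[str, int]:
    """Bolt: consumes word tuples and keeps a running count per word."""
    counts: Dict[str, int] = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Wire the components together: spout -> split bolt -> count bolt.
word_counts = count_bolt(split_bolt(sentence_spout()))
```

In a real topology each component would run as parallel tasks across the cluster; here they are chained in one process only to show how tuples move from the source through successive bolts.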

Does Apache include a search engine?

Yes, Apache contains a search engine. You can search a report name in Apache by using the “Search

Explain how you can stream log files using Apache Storm.

To read from log files, you can configure your spout to emit one tuple per line as it reads the log. The output can then be assigned to a bolt for analysis.
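
As a rough sketch of that idea in plain Python (again, not the Storm API): a spout-like generator emits one tuple per log line and a bolt-like function analyses them. The names `log_spout` and `level_count_bolt`, the sample log, and the assumption that each line starts with its log level are all hypothetical.

```python
import io
from typing import Dict, Iterable, Iterator, TextIO, Tuple


def log_spout(log_file: TextIO) -> Iterator[Tuple[str]]:
    """Spout-like source: emits one tuple per non-empty log line."""
    for line in log_file:
        line = line.rstrip("\n")
        if line:
            yield (line,)


def level_count_bolt(stream: Iterable[Tuple[str]]) -> Dict[str, int]:
    """Bolt-like analysis: counts lines per log level.

    Assumes (for illustration) that the level is the first word of the line.
    """
    counts: Dict[str, int] = {}
    for (line,) in stream:
        level = line.split(" ", 1)[0]
        counts[level] = counts.get(level, 0) + 1
    return counts


# An in-memory stand-in for a log file.
sample_log = io.StringIO("ERROR disk full\nINFO started\nERROR timeout\n")
level_counts = level_count_bolt(log_spout(sample_log))
```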

Why doesn't Apache include SSL?

SSL (Secure Socket Layer) data transport requires encryption, and many governments have restrictions upon the import, export, and use of encryption technology.

If Apache included SSL in the base package, its distribution would involve all sorts of legal and bureaucratic issues, and it would no longer be freely available.

Also, some of the technology required to talk to current clients using SSL is patented by RSA Data Security, which restricts its use without a license.

Explain what a stream and a stream grouping are in Apache Storm.

In Apache Storm, a stream is an unbounded sequence of tuples, while a stream grouping determines how that stream should be partitioned among a bolt’s tasks.

Does Apache include any sort of database integration?

No. Apache is a Web (HTTP) server, not an application server. The base package does not include any such functionality. See the PHP project and the mod_perl project for examples of modules that allow you to work with databases from within the Apache environment.

List the different stream groupings in Apache Storm.

-Shuffle grouping

-Fields grouping

-Global grouping

-All grouping

-None grouping

-Direct grouping

-Local or shuffle grouping

Does Apache come with Java support?

No. The base Apache Web server package does not include support for Java.

Mention how a Storm application can be beneficial in financial services.

In financial services, Storm can help with:

-Securities fraud

-Order routing

-Compliance violations

Can we use Active Server Pages (ASP) with Apache?

The Apache Web Server package does not include ASP support. However, a number of projects provide ASP or ASP-like functionality for Apache.

Explain what TOPOLOGY_MESSAGE_TIMEOUT_SECS is in Apache Storm.

It is the maximum amount of time allotted to the topology to fully process a message emitted by a spout. If the message is not acknowledged within this time frame, Apache Storm fails the message on the spout.
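
The timeout rule can be illustrated with a small plain-Python simulation. It is a simplified sketch, not how Storm tracks tuples internally: the class `PendingTracker`, its methods, and the 30-second value are assumptions chosen for the example, and time is passed in explicitly so the behaviour is deterministic.

```python
from typing import Dict, List

MESSAGE_TIMEOUT_SECS = 30  # assumed value, used only for this illustration


class PendingTracker:
    """Hypothetical tracker for messages a spout has emitted but not yet seen acked."""

    def __init__(self, timeout_secs: int) -> None:
        self.timeout_secs = timeout_secs
        self.pending: Dict[str, float] = {}  # message id -> emit time in seconds

    def emit(self, msg_id: str, now: float) -> None:
        """Record the time at which the spout emitted the message."""
        self.pending[msg_id] = now

    def ack(self, msg_id: str) -> None:
        """The message was fully processed in time; stop tracking it."""
        self.pending.pop(msg_id, None)

    def expire(self, now: float) -> List[str]:
        """Fail every message that has been pending longer than the timeout."""
        failed = [m for m, t in self.pending.items() if now - t > self.timeout_secs]
        for msg_id in failed:
            del self.pending[msg_id]
        return failed


tracker = PendingTracker(MESSAGE_TIMEOUT_SECS)
tracker.emit("a", now=0)
tracker.emit("b", now=0)
tracker.ack("a")                        # "a" is acknowledged in time
failed_early = tracker.expire(now=10)   # nothing has timed out yet
failed_late = tracker.expire(now=31)    # "b" was never acked, so it is failed
```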

While installing, why does Apache have three config files – srm.conf, access.conf and httpd.conf?

The first two are remnants from the NCSA times, and generally you should be ok if you delete the first two, and stick with httpd.conf.

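Explain how to write the output into a file using Storm.

In a spout that reads a file, create the FileReader object in the open() method, so that the reader is initialised on the worker node where the spout runs, and then use that object in the nextTuple() method. The same pattern applies to writing output: create the writer in a bolt's prepare() method and use it in the execute() method.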

How to stop Apache?

To stop Apache, you can use the following command:

/etc/init.d/httpd stop

Mention what is the difference between Apache Kafka and Apache Storm?

Apache Kafka: It is a distributed and robust messaging system that can handle huge amounts of data and allows messages to be passed from one end-point to another.

Apache Storm: It is a real-time message processing system in which you can edit or manipulate data in real time. Apache Storm pulls the data from Kafka and applies the required manipulation.

How to check for the httpd.conf consistency and any errors in it?

We can check the syntax of the httpd configuration file by using the following command. (httpd -S additionally prints the parsed virtual host settings.)

httpd -t

When using fields grouping in Storm, is there any time-out or limit to known field values?

Fields grouping in Storm uses a mod hash function to decide which task to send a tuple to, ensuring that tuples with the same field values always go to the same task. That requires no cache of known values, so there is no time-out or limit to known field values.
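
A minimal sketch of the mod hash idea, in plain Python rather than Storm's own implementation: the function name `fields_grouping_task` is hypothetical, and CRC32 is swapped in as the hash only because it gives the same result on every run.

```python
import zlib
from typing import Sequence


def fields_grouping_task(field_values: Sequence[object], num_tasks: int) -> int:
    """Pick a task index from the grouping fields alone.

    The result depends only on the field values and the number of tasks,
    so no table of previously seen values has to be stored.
    """
    key = "|".join(str(value) for value in field_values).encode("utf-8")
    return zlib.crc32(key) % num_tasks  # CRC32 is an illustrative hash choice


# Tuples grouped on a hypothetical "user" field, routed among 4 tasks.
tasks = [fields_grouping_task((user,), 4) for user in ["alice", "bob", "alice"]]
```

Both "alice" tuples land on the same task index, and because the index is computed rather than looked up, nothing can expire or fill up as new field values appear.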

What is the ServerType directive in Apache Server?

It defines whether Apache runs as a standalone daemon that manages its own child processes (standalone) or is started on demand by the inetd system process for each connection (inetd). Running it under inetd conserves resources. This directive is deprecated, however.

In which folder are Java Applications stored in Apache?

Java applications are not stored in Apache. Apache can only be connected to another web server that hosts Java web applications, using the mod_jk connector.

What is mod_vhost_alias?

This module creates dynamically configured virtual hosts, by allowing the IP address and/or the Host: header of the HTTP request to be used as part of the pathname to determine what files to serve. This allows for easy use of a huge number of virtual hosts with similar configurations.

What is Struts? Explain its purpose.

Struts is an open source framework for creating Java web applications.

Is running Apache as root a security risk?

No. The root process opens port 80 but never serves requests on it itself, so no user will actually enter the site with root rights. If you kill the root process, you will see the child processes disappear as well.

