The Robust Stream Cipher for Securing Data in the Smartphones

The growth of network and communication systems across the world has increased security problems in data transmission, such as data leakage, modification, unauthorized access, and attacks. Many techniques are used to prevent these problems and protect data. One of them is the stream cipher, which is considered one of the strongest and fastest methods for encryption and decryption. This study presents a new stream cipher design to protect mobile data. The strength of a stream cipher depends on its key, and there are several methods to generate a key. We used three types of generator and then used a combiner to pass their outputs through a nonlinear Boolean function in order to make the key generator more secure. To implement the new key generator from these three kinds, we used four LFSRs and one of an NLFSR or an FCSR to produce the inputs of a five-variable Boolean function. These variables serve as the input to the combiner function. Finally, we tested the generator with the randomness tests that are publicly available from the National Institute of Standards and Technology (NIST).


Introduction:
In recent years, communication networks have spread over wide areas, and data transmitted between devices has become more susceptible to many kinds of attack. The data must be protected, and many researchers are interested in the field of data security. An efficient data protection system based on encryption needs a kind of cryptography that is fast to process and costs as little as possible. The stream cipher is considered the best method for encrypting such data because it is very fast to implement in software and hardware and is suitable for many applications and devices [1]. Several papers address lightweight stream ciphers, such as [2], [3], and [4].
Stream ciphers are a category of symmetric cryptography, which means that the same key is used for sending (encryption) and receiving (decryption). A pseudo-random number generator is the most common way to generate the key (or keystream), which is then XORed with the plaintext to produce the ciphertext [1]. Several stream cipher algorithms are proposed in [5]-[8], and some of them have proved effective against attacks. This paper proposes a new stream cipher design that is suitable for mobile communication systems. We use several types of pseudo-random number generator, namely LFSRs, FCSRs, and NLFSRs, and combine them with a combination function to generate a keystream. We also introduce a new idea, a comparator, which is placed between the NLFSR and the FCSR to make the generator more robust, because these two types are relatively new and remain difficult to analyze.
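The XOR step described above can be illustrated with a minimal Python sketch. The function name and the toy keystream are hypothetical and serve only to show that applying the same keystream twice recovers the plaintext; a repeating four-byte pattern is not a secure keystream and is not the generator proposed in this paper.

```python
from itertools import cycle


def xor_with_keystream(data: bytes, keystream) -> bytes:
    """XOR every data byte with the next keystream byte."""
    return bytes(b ^ k for b, k in zip(data, keystream))


def toy_keystream():
    """Illustrative keystream only: a short repeating pattern is NOT secure."""
    return cycle(b"\x5a\xc3\x17\x88")


plaintext = b"hello mobile"
# Encryption and decryption are the same operation with the same keystream.
ciphertext = xor_with_keystream(plaintext, toy_keystream())
recovered = xor_with_keystream(ciphertext, toy_keystream())
```

Because XOR is its own inverse, the receiver only needs to regenerate the same keystream from the shared key.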
The remainder of the paper is arranged as follows. Section 2 describes the types of generator used to produce the keystream and briefly explains how each works. Section 3 describes the proposed keystream generator design, the properties of each type of register, and the combination function. Section 4 evaluates the proposed keystream generator using the NIST tests. Section 5 presents the conclusions.

Linear Feedback Shift Registers (LFSRs):
LFSRs are the most widely recognized kind of shift register used in cryptography. An LFSR consists of two parts, as shown in Figure 1: a shift register and a feedback function. The shift register is a set of cells (stages), each of which stores one bit (1 or 0). The initial state of the LFSR is called the seed. On each tick of the external clock, which controls the movement of the bits, all bits are shifted one position to the right simultaneously, and the new leftmost bit of the register is computed by the feedback function. The feedback function is the exclusive-or (XOR) of specific bits in the register, called the tap sequence; this arrangement is the Fibonacci configuration [7]. Any LFSR can be represented by its connection polynomial, shown below [8].

$$P(x) = 1 + \sum_{i=1}^{L} c_i x^i \tag{1}$$

Figure 1: Linear Feedback Shift Registers [8]

LFSRs are a basic component for generating a sequence of bits as a keystream, and they are well suited to stream cipher applications. They are easy to implement in both hardware and software, at low cost and in little time. They also produce bit sequences with excellent statistical features.
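The shift-and-feedback operation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `lfsr_bits` is hypothetical, the register is stored oldest bit first (so the list shifts left rather than right, which is only a presentational choice), and the 4-stage register with connection polynomial 1 + x^3 + x^4 is a textbook example, not one of the registers used in the proposed design.

```python
def lfsr_bits(taps, seed, count):
    """Return `count` output bits of a Fibonacci LFSR.

    taps: the indices i with c_i = 1 in P(x) = 1 + sum(c_i * x^i)
    seed: list of L bits [s_0, ..., s_(L-1)], oldest first, not all zero
    """
    state = list(seed)
    length = len(state)
    output = []
    for _ in range(count):
        output.append(state[0])            # the oldest bit leaves the register
        new_bit = 0
        for i in taps:
            new_bit ^= state[length - i]   # XOR of the tapped bits s_(j-i)
        state = state[1:] + [new_bit]      # shift and insert the feedback bit
    return output


# Example: P(x) = 1 + x^3 + x^4 is primitive, so the period is 2^4 - 1 = 15.
example_bits = lfsr_bits((3, 4), [1, 0, 0, 0], 30)
```

With a primitive polynomial every nonzero seed runs through all 15 nonzero states before repeating, which is the maximal-period behaviour that the properties in Section 3 rely on.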

Feedback with Carry Shift Registers (FCSRs):
FCSRs are similar to LFSRs. Both contain a shift register and a feedback function. However, an FCSR has an additional part called a carry register, as shown in Figure 2. Moreover, FCSRs do not use XOR in their feedback function; they use integer summation instead. The values of the active taps are added, as integers, to the value of the carry register. The result is an integer σ whose parity gives the new bit: the new bit of the state is σ mod 2, and the new carry value is ⌊σ/2⌋, where ⌊ ⌋ denotes the integer part [7].
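The update rule above (sum the tapped bits and the carry, keep σ mod 2 as the new bit and ⌊σ/2⌋ as the new carry) can be sketched as follows. The function name `fcsr_bits` and the example taps are assumptions made for illustration. Under the convention common in the FCSR literature, where the connection integer is q = -1 + q_1·2 + q_2·2^2 + ... + q_r·2^r, the taps (1, 2, 5) would correspond to q = 37; this is a toy value, not the register used in the proposed design.

```python
def fcsr_bits(taps, seed, carry, count):
    """Return `count` output bits of a feedback-with-carry shift register.

    taps:  the indices k of the active taps (q_k = 1)
    seed:  list of r bits, oldest first
    carry: initial integer value of the carry register
    """
    state = list(seed)
    r = len(state)
    output = []
    for _ in range(count):
        output.append(state[0])
        # Integer sum (not XOR) of the tapped bits plus the carry.
        sigma = carry + sum(state[r - k] for k in taps)
        state = state[1:] + [sigma % 2]   # new bit is sigma mod 2
        carry = sigma // 2                # new carry is floor(sigma / 2)
    return output


# Hypothetical 5-stage example with taps (1, 2, 5).
example_fcsr = fcsr_bits((1, 2, 5), [1, 0, 0, 0, 0], 0, 300)
```

The only difference from an LFSR step is the integer addition with memory, which is what makes the sequence nonlinear over the binary field.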
To construct an FCSR with excellent properties, the following equations are used. An FCSR is represented by its connection integer (q), defined in Definition 3.1 in Section 3 of [16], where q1, q2, q3, . . ., qr are the taps and r describes the number of active taps through the equation below [16].
Here q should be a prime number. The number of cells (stages) of the FCSR is determined by the next equation [16].
Moreover, from the result of equation (3), the number of carry bits can be calculated by the equation [16].
Finally, the initial state of this register must be chosen carefully. If the output sequence $s = (s_0, s_1, s_2, \dots)$ generated by the FCSR with connection integer q is periodic, and if $y = 2^{-1}$ is the multiplicative inverse of 2 modulo q, then the output sequence is given by the equation [16].
Figure 2: Feedback with Carry Shift Registers [15]
The FCSR, as mentioned in [7], is a newer generator that can be used to produce a random key. It uses arithmetic on 2-adic numbers instead of the arithmetic in $\mathbb{F}_2$ used by both LFSRs and NLFSRs (where $\mathbb{F}_2$ is the finite field with the binary elements 1 and 0). It gives good statistical properties and, because of its built-in nonlinearity, can resist many attacks.

Nonlinear Feedback Shift Register (NLFSR):
An NLFSR is another type of generator used to construct a keystream as a pseudo-random generator, as shown in Figure 3. In recent years, NLFSRs have received much attention from researchers designing novel cryptographic algorithms, because NLFSRs are more resistant than LFSRs to existing cryptanalysis techniques. The state update of an NLFSR contains a nonlinear function that provides these security properties. However, NLFSRs have drawbacks. One is that the period of the output sequence may not reach the expected maximal length. Another is that the output of an (n,k)-NLFSR with period $2^n - 1$ does not satisfy the first and second postulates of Golomb. These problems can be addressed by using the Fibonacci configuration [9] [10].

Figure 3: An example of Fibonacci NLFSRs [9].
NLFSRs are considered the best and most complex way to design a keystream generator compared with the other types (FCSRs and LFSRs). No general method or mathematical theory for analyzing them is known so far, so they are considered one of the most powerful ways to design a keystream [7]. In this paper we use Fibonacci NLFSRs rather than Galois NLFSRs because they have two main advantages. First, the output sequence of a Fibonacci NLFSR with period $2^L - 1$ always satisfies the first and second of Golomb's postulates [11], while a Galois (L,k)-NLFSR does not satisfy either of them. Second, the period of the output sequence always equals the maximal length, while for a Galois (L,k)-NLFSR it is not necessarily equal to the maximal period [12], [10]. In other words, a Fibonacci NLFSR gives pure cycles if the feedback function has the form shown below.

$$f(s_0, s_1, \dots, s_{n-1}) = s_0 \oplus g(s_1, \dots, s_{n-1}) \tag{6}$$
where g does not depend on $s_0$. Because f is of de Bruijn type, the states $(s_0, s_1, \dots, s_{n-1})$ and $(\bar{s}_0, s_1, \dots, s_{n-1})$ must have different successors. The value of g is the same for both states, while the values of $s_0$ and $\bar{s}_0$ differ, so f gives different feedback bits for the two states only when it has the form of equation (6).
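The pure-cycle condition can be checked by brute force on a small register. The following is a minimal sketch: the 4-bit register and the two feedback functions are hypothetical examples chosen for illustration, not the NLFSR used in the proposed design. A register has pure cycles (no branching states) exactly when every state has a distinct successor.

```python
from itertools import product


def step(state, feedback):
    """One Fibonacci NLFSR step: drop s_0 and append the feedback bit."""
    return state[1:] + (feedback(state),)


def has_pure_cycles(n, feedback):
    """True when all 2^n states have distinct successors (no branches)."""
    successors = {step(s, feedback) for s in product((0, 1), repeat=n)}
    return len(successors) == 2 ** n


def f_with_s0(s):
    """Feedback of the form s_0 XOR g(s_1, s_2, s_3)."""
    return s[0] ^ (s[1] ^ (s[2] & s[3]))


def f_without_s0(s):
    """Feedback that ignores s_0, so two states share each successor."""
    return s[1] ^ (s[2] & s[3])


pure = has_pure_cycles(4, f_with_s0)          # expected to hold
branching = has_pure_cycles(4, f_without_s0)  # expected to fail
```

When s_0 is XORed into the feedback, flipping s_0 always flips the new bit, so two states that differ only in s_0 can never merge into the same successor.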

The proposed model
This section describes in detail the structure of the new stream cipher design, which generates the key in the form of a Pseudo-Random Number Generator (PRNG). The security of a stream cipher depends on its key, which must be random so that it is hard for attackers to break or predict.
The proposed structure appears in Figure 4. It comprises two parts: the inputs and a combination Boolean function with five variables. The first part, the inputs, consists of three types of generator, LFSRs, NLFSRs, and FCSRs, which together create the keystream. The properties and specifications of each kind are explained below.

Properties of LFSR Sequences:
LFSRs have many characteristics that help generate a good sequence with good statistical properties. All the characteristics below were used to implement our four LFSR generators.
1-Through its feedback coefficients, an LFSR produces sequences of different lengths; the coefficients are represented by the connection polynomial shown in equation (1).
2-An LFSR should generate a bit sequence with maximal period (the longest pseudo-random sequence before it repeats) to obtain good statistical properties. For this, the connection polynomial must be primitive, so that the maximal period equals $2^L - 1$ [5]-[8]. The LFSR should preferably be non-singular so that its output is periodic; in other words, the degree of the feedback polynomial of a non-singular LFSR equals the LFSR length. Singular LFSRs are ignored because their output is only ultimately periodic [8] [5].
3-The design should use a primitive polynomial that is dense (not sparse) to improve the security of communication systems and applications and make attacks more difficult [8] [5] [13].
4-Linear complexity (L.C), or linear span, is the length of the shortest LFSR that generates the output sequence (S). The linear complexity should be large so that recovering the register length is extremely difficult. Our generator was therefore constructed with a very high L.C.
Although LFSRs are fast to implement in hardware and software and give a long period and good statistical properties, they are easy to break because they are linear: the taps can be recovered using the Berlekamp-Massey algorithm [14] [15].
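To make the weakness concrete, the linear complexity of a bit sequence can be computed with the Berlekamp-Massey algorithm over the binary field. The sketch below is a standard textbook formulation, with the hypothetical function name `linear_complexity`; the test sequence is two periods of the output of a 4-stage LFSR with connection polynomial 1 + x^3 + x^4, used only as an illustration.

```python
def linear_complexity(bits):
    """Length of the shortest LFSR that generates `bits` (Berlekamp-Massey)."""
    n = len(bits)
    c = [0] * (n + 1)   # current connection polynomial
    b = [0] * (n + 1)   # copy of c from the last length change
    c[0] = b[0] = 1
    length, last_change = 0, -1
    for i in range(n):
        # Discrepancy between the predicted bit and the actual bit.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - last_change
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * length <= i:
                length = i + 1 - length
                last_change = i
                b = previous
    return length


# Two periods of a 4-stage maximal-length LFSR sequence.
m_sequence = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1] * 2
complexity = linear_complexity(m_sequence)
```

An attacker who observes about twice the linear complexity in keystream bits can rebuild an equivalent LFSR, which is why a bare LFSR is not used as a keystream generator and why the outputs are passed through a nonlinear combiner.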

Properties of FCSR sequences
The following points list the characteristics that were used to implement this generator in our design. They cover how to calculate the number of stages and carry bits of the register and how to set the initial state in order to generate random sequences.
1-The FCSR is represented by the connection integer (q), as shown in equation (2). From q, the number of stages of the FCSR is determined by equation (3), and from that result the number of carry bits is determined by equation (4). In our design q is a prime number in order to generate a maximum period.
2-Equation (5) is used to determine the initial loading of the FCSR. To generate a sequence with the longest period, the initial values of the register and the carry must be chosen carefully.
3-The maximum possible period of an FCSR sequence is q-1. The period can also be calculated by $\varphi(q)$, Euler's phi function, which equals the number of positive integers less than q whose GCD with q is 1. See Definition 13.1 in [16].
4-Any formal power series $\alpha$, which is a 2-adic number, gives an infinite binary sequence $s = (s_0, s_1, s_2, \dots)$. The binary sequence s is periodic if and only if the 2-adic number is a rational number $\alpha = r/q$, where r and q are integers, $\alpha < 0$, and $|r| < |q|$.
5-The 2-adic span is the size of the smallest FCSR that generates the output sequence (s). The 2-adic span can be determined efficiently using 2-adic approximation theory (see Section 10 of [16]). There is an algorithm that finds the smallest FCSR that generates the sequence s. It needs knowledge of only 2y+2log(y) bits of s (where y is the 2-adic span of s). This algorithm is based on de Weger's theory [17].
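The period statement in point 3 can be checked numerically. The sketch below assumes the standard result from the FCSR literature that the period of a periodic FCSR sequence with connection integer q equals the multiplicative order of 2 modulo q, which reaches q - 1 = φ(q) when q is prime and 2 is a primitive root modulo q. The function names are hypothetical, and the values q = 37 and q = 7 are illustrations only, not the connection integer of the proposed design.

```python
from math import gcd


def euler_phi(q):
    """Count the integers in [1, q) that are coprime to q."""
    return sum(1 for k in range(1, q) if gcd(k, q) == 1)


def order_of_two(q):
    """Multiplicative order of 2 modulo an odd integer q > 1."""
    value, order = 2 % q, 1
    while value != 1:
        value = (value * 2) % q
        order += 1
    return order


# q = 37: 2 is a primitive root, so the period reaches q - 1 = 36.
# q = 7: the order of 2 is only 3, so a prime q alone is not sufficient.
periods = {q: (order_of_two(q), euler_phi(q)) for q in (37, 7)}
```

The second value shows why the choice of q matters: both integers are prime, but only the first yields the maximum period.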

Properties of NLFSR sequences.
Fibonacci NLFSRs have many important properties for producing a sequence with a huge period, as illustrated in the following points [9].
1-A Fibonacci NLFSR has pure cycles, so the feedback function in our design has the form of equation (6) (see section 2.3).
2-There are $2^{2^{L-1}-L+1}$ different L-bit Fibonacci NLFSRs whose sequence has period $2^L - 1$. This formula can be derived as follows. The graph $G_L$ has $2^L$ nodes, representing all possible states of an L-bit NLFSR, and $2^{L+1}$ edges, representing all possible transitions between these states. Each node of $G_L$ has two outgoing and two incoming edges.
3-A cycle of length $2^L - 1$ is obtained from a cycle of length $2^L$ by deleting the loop at node 00...0 or, alternatively, the loop at node 11...1 of $G_L$. Since there are no other loops in $G_L$, there are precisely two cycles of length $2^L - 1$ for each cycle of length $2^L$. Therefore, the number of cycles of length $2^L - 1$ in $G_L$ equals $2 \cdot 2^{2^{L-1}-L}$.
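As a worked instance of the counting argument in points 2 and 3, take L = 4, a value chosen only for illustration and not the register length used in the design:

```latex
\begin{align*}
\text{cycles of length } 2^{4} &: \quad 2^{2^{4-1}-4} = 2^{4} = 16, \\
\text{cycles of length } 2^{4}-1 &: \quad 2 \cdot 2^{2^{4-1}-4} = 2^{2^{4-1}-4+1} = 2^{5} = 32.
\end{align*}
```

The two expressions in the text therefore agree: doubling the number of full-length cycles gives the number of 4-bit Fibonacci NLFSRs with period 15.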

Second Part (Combination Functions (ƒ))
Generators such as LFSRs and FCSRs have a drawback when used to generate a keystream: they are linear, so the length of each register and the number of taps are very easy to detect with attack algorithms such as Berlekamp-Massey. To avoid this problem, a nonlinear generator technique should be used: a nonlinear combination generator, a nonlinear filter generator, or a clock-controlled generator. In this paper, we used a combination generator to destroy the linearity of the different types of key generator. Combination generators are used in many stream cipher applications. A combination generator consists of several running-key generators operating in parallel, whose outputs are combined to produce the keystream, as in Figure 4.
The resulting Boolean function (ƒ) is called an $m$th-order resilient Boolean function. The expression ƒ(Z1, Z2, . . ., Zn) is represented by the algebraic normal form (ANF) of ƒ [18]. It is described by four parameters (n,m,d,x): the number of variables, the resiliency, the algebraic degree, and the nonlinearity. A resilient Boolean function in a stream cipher must satisfy several criteria in order to resist as many attacks as possible at the same time and produce a good stream cipher. These criteria are balance, high nonlinearity, high algebraic degree, and high algebraic immunity [19] [20]. Several studies have constructed resilient Boolean functions with good cryptographic properties [19]-[22].
Properties of Resilient Boolean Function

2-Nonlinearity
A Boolean function ƒ should achieve as high a nonlinearity as possible. If the Boolean function is balanced, then for an odd number of variables n the nonlinearity equals
$$2^{n-1} - 2^{(n-1)/2} \tag{7}$$
If the Boolean function is balanced, then the nonlinear order of ƒ is at most n-m-1, for 1≤ m ≤ n-2 [5].

3-Algebraic Degree.
The algebraic degree of the Boolean function ƒ must be high, since any cryptosystem that uses Boolean functions can be attacked if the functions have low algebraic degree. When the algebraic degree increases, the linear complexity also increases, so the generator design becomes stronger and the Berlekamp-Massey algorithm becomes computationally infeasible. The algebraic degree is bounded by d ≤ n-m-1 [8], [19].

4-Correlation Immunity.
The correlation immunity of ƒ must be high. Correlation immunity ensures that the output of the Boolean function ƒ does not leak information about any of its input variables. There are tradeoffs between high correlation immunity, high algebraic degree, and high nonlinearity.
Several attacks target combination functions, such as correlation attacks, fast correlation attacks, and algebraic attacks. These attacks try to recover the contents of each register. If the length of each register and the combination function are known, then the only secret is the initial states. Therefore, most attacks on combination functions try to identify the initial states of all registers by exploiting the statistical dependence between the output of a single register and the keystream. To make these attacks impractical, the four LFSRs in the design must use polynomials that are not sparse. The correlation immunity (CI) of the combining function must also be increased while the function remains balanced. Moreover, the nonlinearity must be high to prevent correlation attacks and fast correlation attacks [8] [5].
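The criteria discussed above (balance, nonlinearity, algebraic degree, and correlation immunity) can all be computed from the truth table of a candidate function. The sketch below is a minimal illustration with hypothetical helper names. It does not reproduce the (5,1,3,12) function selected from [23]; instead it evaluates the well-known three-variable Geffe combiner, x1·x2 ⊕ x2·x3 ⊕ x3, which is balanced but not correlation immune.

```python
from itertools import product


def truth_table(f, n):
    """Outputs of f over all n-bit inputs in lexicographic order."""
    return [f(*x) for x in product((0, 1), repeat=n)]


def walsh_spectrum(tt, n):
    """Walsh values W(a) = sum over x of (-1)^(f(x) XOR a.x)."""
    inputs = list(product((0, 1), repeat=n))
    spectrum = {}
    for a in inputs:
        total = 0
        for x, fx in zip(inputs, tt):
            dot = sum(ai & xi for ai, xi in zip(a, x)) % 2
            total += (-1) ** (fx ^ dot)
        spectrum[a] = total
    return spectrum


def nonlinearity(tt, n):
    """Distance to the nearest affine function."""
    spectrum = walsh_spectrum(tt, n)
    return 2 ** (n - 1) - max(abs(v) for v in spectrum.values()) // 2


def algebraic_degree(tt):
    """Degree of the ANF, via the binary Moebius transform."""
    anf = list(tt)
    step = 1
    while step < len(anf):
        for i in range(len(anf)):
            if i & step:
                anf[i] ^= anf[i ^ step]
        step *= 2
    return max((bin(i).count("1") for i, c in enumerate(anf) if c), default=0)


def resiliency_order(tt, n):
    """Largest m with W(a) = 0 for every a of weight <= m (-1 if unbalanced)."""
    spectrum = walsh_spectrum(tt, n)
    order = -1
    for weight in range(n + 1):
        if any(v != 0 for a, v in spectrum.items() if sum(a) == weight):
            break
        order = weight
    return order


def geffe(x1, x2, x3):
    """Textbook Geffe combiner, used here only as an example."""
    return (x1 & x2) ^ (x2 & x3) ^ x3


geffe_tt = truth_table(geffe, 3)
geffe_profile = (
    sum(geffe_tt) == 4,              # balanced
    nonlinearity(geffe_tt, 3),
    algebraic_degree(geffe_tt),
    resiliency_order(geffe_tt, 3),
)
```

For the Geffe function the resiliency order is 0: the output agrees with x1 three times out of four, which is exactly the kind of statistical dependence that a correlation attack exploits and that a 1-resilient combiner is meant to remove.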
In Figure 4 (the proposed structure of the key generator), the inputs (registers) follow the specifications required to generate the key, as described in sections 3.1.1, 3.1.2, and 3.1.3. In the design, the GCD between each pair of registers should equal 1 [8]. We selected the same length for the NLFSR and the FCSR: the NLFSR length was taken from [9] and the FCSR length from [7]. The combination function with five variables (5,1,3,12) was selected from [23]. Five of the six available variables must be selected through the comparator, which is placed between the NLFSR and the FCSR and selects one of them for each time interval, for example every 100 periods. We therefore divided the inputs into two groups: group A includes the four LFSRs with the NLFSR, and group B includes the four LFSRs with the FCSR. The comparator works on the contents of the two registers. If the output of the combination function equals 1, the comparator selects by the greater (or equal) value of the register contents; otherwise it selects by the smaller value. The comparator helps prevent recovery of the initial states of the NLFSR and the FCSR, so the attacker cannot determine which group is switched on.
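One possible reading of the comparator rule is sketched below. Everything in it is an assumption made for illustration: the text does not specify how register contents are compared, so the sketch interprets each register as an unsigned integer, and the function names and the tie-breaking choice are hypothetical rather than the implemented design.

```python
def register_value(bits):
    """Interpret the register contents as an unsigned integer (assumption)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def comparator_select(nlfsr_state, fcsr_state, combiner_bit):
    """Pick group 'A' (four LFSRs + NLFSR) or 'B' (four LFSRs + FCSR).

    combiner_bit == 1: choose the register with the greater or equal value.
    combiner_bit == 0: choose the register with the smaller value.
    """
    nlfsr_value = register_value(nlfsr_state)
    fcsr_value = register_value(fcsr_state)
    if combiner_bit == 1:
        return "A" if nlfsr_value >= fcsr_value else "B"
    return "A" if nlfsr_value < fcsr_value else "B"


# Hypothetical 3-bit register contents, evaluated once per selection interval.
choice_when_one = comparator_select([1, 0, 1], [0, 1, 1], 1)
choice_when_zero = comparator_select([1, 0, 1], [0, 1, 1], 0)
```

In a full generator this selection would be re-evaluated at each interval (for example every 100 periods, as stated above), so the active group depends on secret register contents rather than on a fixed schedule.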

Results:
This section tests the key generator and presents the experimental results obtained with the National Institute of Standards and Technology (NIST) test suite described in [24]. The NIST test suite is a statistical package consisting of several tests. Its purpose is to test the randomness of binary sequences produced by a pseudorandom number generator.

Result of the NIST Statistical Test:
In this section, we present the NIST results. We applied five of the NIST tests to the proposed keystream generator design (see Figure 4 in section 3). The sequence passed all five tests: the P-value calculated for each test is > 0.01. We conclude that the design has excellent statistical properties and can be considered random. It can also resist many known attacks.

Result of the Frequency (Monobit) Test
Figure 5 shows the relationship between the p-value (y-axis) and the number of bits (x-axis). We varied the number of bits (n) from 100 to $10^6$. The p-value was almost 1 when n = 100 and 0.15 when n = 500. In both cases the p-value was larger than 0.1, which is the desired outcome because, as mentioned in [24], a p-value larger than 0.01 means the key is random enough and very hard to break. When the number of bits was set to $10^3$ and $10^5$, the p-values were close to 0.9, which is a very good result. When the number of bits was $6 \times 10^5$, the p-value was close to 0.7. When the number of bits was $2 \times 10^5$, $3 \times 10^5$, $4 \times 10^5$, $7 \times 10^5$, $8 \times 10^5$, $9 \times 10^5$, and $10^6$, the p-values fell between 0.4 and 0.6. All points passed this test because every p-value was greater than 0.01.
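For reference, the p-value of the frequency (monobit) test can be computed as in the following sketch, which follows the test definition in the NIST documentation [24]. The function name is hypothetical, and the code is an illustration rather than the implementation used to produce Figure 5.

```python
from math import erfc, sqrt


def monobit_p_value(bits):
    """P-value of the NIST frequency (monobit) test."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum.
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / sqrt(n)
    return erfc(s_obs / sqrt(2))


# Ten-bit example sequence: six ones and four zeros.
example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
p_example = monobit_p_value(example)
```

For this ten-bit example the function returns approximately 0.527, well above the 0.01 threshold; a heavily biased sequence would drive the value toward zero.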

Result for Frequency Test within a Block:
Figure 6 below illustrates the results of the frequency test within a block. For the output sequence with n = 100, the p-value is 0.616, while for the other sequences the P-value is stable at 1. Note: when this test is applied to output sequences of different lengths, the value M (the number of bits in each block) changes; it should satisfy M ≥ 0.1n with the number of blocks N < 100 [24]. For instance, for the sequence with n = 100 the number of bits in each block is M = 0.1*100 = 10, and similarly for the other sequences.
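The block frequency statistic can be sketched as follows, again following the test definition in [24]. The function names are hypothetical. Because the Python standard library has no incomplete gamma function, the helper below evaluates the upper regularized incomplete gamma function only for arguments that are multiples of 1/2, which is all this test needs; `scipy.special.gammaincc` computes the same quantity if SciPy is available.

```python
from math import erfc, exp, gamma, sqrt


def igamc_half_integer(a, x):
    """Upper regularized incomplete gamma Q(a, x) for a = k/2, k >= 1."""
    # Closed-form starting points: Q(1/2, x) = erfc(sqrt(x)), Q(1, x) = exp(-x).
    if (2 * a) % 2 == 1:
        current, q = 0.5, erfc(sqrt(x))
    else:
        current, q = 1.0, exp(-x)
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a * exp(-x) / Gamma(a + 1).
    while current < a:
        q += x ** current * exp(-x) / gamma(current + 1)
        current += 1
    return q


def block_frequency_p_value(bits, block_size):
    """P-value of the NIST frequency test within a block."""
    n_blocks = len(bits) // block_size
    chi_sq = 0.0
    for i in range(n_blocks):
        block = bits[i * block_size:(i + 1) * block_size]
        proportion = sum(block) / block_size   # fraction of ones in the block
        chi_sq += (proportion - 0.5) ** 2
    chi_sq *= 4 * block_size
    return igamc_half_integer(n_blocks / 2, chi_sq / 2)


# Ten-bit example with blocks of three bits (the last bit is discarded).
example = [0, 1, 1, 0, 0, 1, 1, 0, 1, 0]
p_example = block_frequency_p_value(example, 3)
```

For this example the result is approximately 0.801. With n = 100 and M = 10, as in the note above, the sequence is split into N = 10 blocks.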

Result for Runs Test
In this test, we used several values for the number of bits, as recommended by NIST, in ascending order starting at n ≥ 100. The chart shows that all the values used passed the test and are considered random. For example, for a sequence of n = 100 bits the p-value is 0.07. When n is doubled to 200 bits the p-value increases to 0.14, and for n = 500 bits it increases further to 0.85. In Figure 7, the p-value peaks at approximately 1 for n equal to 1000 and 10000 bits. For the other values of n, between $10^5$ and $10^6$ bits, most p-values range from 0.07 to 0.41.
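A minimal sketch of the runs test, following the definition in [24], is given below. The function name is hypothetical, and the code is an illustration rather than the implementation behind Figure 7.

```python
from math import erfc, sqrt


def runs_p_value(bits):
    """P-value of the NIST runs test."""
    n = len(bits)
    proportion = sum(bits) / n
    # Prerequisite: the sequence must first look balanced (frequency check).
    if abs(proportion - 0.5) >= 2 / sqrt(n):
        return 0.0
    # Total number of runs = 1 + number of positions where the bit changes.
    v_obs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    expected = 2 * n * proportion * (1 - proportion)
    scale = 2 * sqrt(2 * n) * proportion * (1 - proportion)
    return erfc(abs(v_obs - expected) / scale)


# Ten-bit example with seven runs.
example = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
p_example = runs_p_value(example)
```

For this example the function returns approximately 0.147. Too many or too few runs both push the statistic away from its expected value and lower the p-value.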

Result for the Longest Run of Ones in a Block
The results of this test are shown in two figures. We take three minimum lengths of n, namely 128, 6272, and 750000, with M equal to 8, 128, and $10^4$ respectively. For the first length, n = 128, the p-value is 0.23. The results for the other lengths are illustrated in the two figures below, where M is constant (128 and $10^4$) and the length varies from 6272 to $10^6$ in Figure 8 and from 750000 to $10^6$ in Figure 9. All the values selected in the two figures have a p-value greater than 0.01, so all the sequences are random.
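The smallest case of this test (M = 8, suitable for n = 128) can be sketched as follows. The function names are hypothetical, the class probabilities are the values tabulated for M = 8 in the NIST documentation [24], and the cases M = 128 and M = 10^4 are not covered by this sketch because they use different class boundaries and probabilities.

```python
from math import erfc, exp, pi, sqrt

# Probabilities that the longest run of ones in an 8-bit block is
# <= 1, 2, 3, or >= 4, as tabulated for M = 8 in the NIST documentation.
PI_M8 = (0.2148, 0.3672, 0.2305, 0.1875)


def longest_run(block):
    """Length of the longest run of ones in a block."""
    best = current = 0
    for bit in block:
        current = current + 1 if bit else 0
        best = max(best, current)
    return best


def longest_run_p_value_m8(bits):
    """P-value of the longest-run-of-ones test for block length M = 8."""
    n_blocks = len(bits) // 8
    counts = [0, 0, 0, 0]
    for i in range(n_blocks):
        run = longest_run(bits[8 * i:8 * i + 8])
        counts[min(max(run, 1), 4) - 1] += 1   # clamp the run into 4 classes
    chi_sq = sum(
        (v - n_blocks * p) ** 2 / (n_blocks * p) for v, p in zip(counts, PI_M8)
    )
    x = chi_sq / 2
    # Closed form of the upper regularized incomplete gamma Q(3/2, x).
    return erfc(sqrt(x)) + 2 * sqrt(x / pi) * exp(-x)


# A degenerate all-zero input should be rejected (p-value far below 0.01).
p_all_zero = longest_run_p_value_m8([0] * 128)
```

A sequence passes when the observed distribution of longest runs across the blocks is close to the tabulated probabilities, which gives a small chi-square value and a p-value above 0.01.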

Conclusion
In this study, a new stream cipher design was proposed, based on a Pseudo Random Number Generator (PRNG) of a kind considered appropriate for modern communication systems. The design consists of three types of generator whose outputs are fed into a combiner that applies a nonlinear Boolean function in order to make the key algorithm more secure. From the results obtained and the security analysis, we can say that the proposed design is very strong and that its key is random and very hard to break. These features indicate that our stream encryption algorithm is appropriate for encrypting data before transmission over public communication channels.

Future work
First, we are going to run the rest of the NIST tests to evaluate our design. In addition, we are going to create a new application for Android mobile phones based on our design. Finally, we also plan to encrypt voice and video using our design.

CONFLICT OF INTERESTS.
-There are no conflicts of interest.

Figure 4: Proposed design for key generator. It comprises two parts: the first part is the inputs, and the second part is a combination Boolean function with five variables.

Figure 5: The p-value for the Frequency (Monobit) Test

Figure 6: The P-value for the Frequency Test within a Block

Figure 8: The p-value for the Test for the Longest Run of Ones in a Block (n = 6272-1000000)

Figure 9: