ABSTRACT

This study examines the analysis of fixed-effect, non-interactive unbalanced data by a method called the Intra-Factor Design. To derive this design, the matrix version of the fixed-effect model, Yijk = µ + τi + βj + εijk, was used. This led to the definition and formation of several matrices and vectors: the Information Matrix, L; the Replication Vector, r; the Incidence Matrix, N; the Vector of adjusted Factor A totals, q; the Variance–Covariance Matrix, Q, which is the generalized inverse of the Information Matrix; and others. The Least Squares Method, which yields a set of normal equations, was used to estimate the parameters τ and β. An illustrative example was then given to demonstrate the workability of the Intra-Factor procedure in testing the main effects for significance under stated hypotheses. Before testing the main effects on the illustrative data, it was first necessary to ascertain that the data are fixed-effect and that interaction is either absent or non-significant, since our interest is in fixed-effect non-interactive models. Thereafter, the Analysis of Variance Components for Adjusted Factor A with Unadjusted Factor B effects was carried out. The Adjusted Factor A effect, τ, was not significant, whereas the Unadjusted Factor B effect, β, was significant. The Analysis of Variance Components was also performed for Unadjusted Factor A with Adjusted Factor B effects, and it yielded a similar result. We therefore conclude that, for the analysis of fixed-effect non-interactive unbalanced data, the Method of Intra-Factor Design can be successfully employed.

CHAPTER ONE

INTRODUCTION

1.0 INTRODUCTION

Estimating variance components from unbalanced data is not as

straightforward as from balanced data. This is so for two reasons. Firstly,

several methods of estimation are available (most of which reduce to the

analysis of variance method for balanced data), but no one of them has yet

been clearly established as superior to others. Secondly, all the methods

involve relatively cumbersome algebra; discussion of unbalanced data can

therefore easily deteriorate into a welter of symbols, a situation we do our

best to minimize here. However, we shall review some works on

unbalanced data.

1.1 GENERAL OUTLAY OF UNBALANCED DATA

Balanced data are those in which every subclass of the model has the same number of observations. Unbalanced data, in contrast, are those in which the numbers of observations in the subclasses are not all equal, including cases where some subclasses contain no observations at all. Thus unbalanced data covers both the situation where every subclass is filled and the situation where some subclasses are empty. The estimation of variance components from unbalanced data is more complicated than from balanced data.
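
The distinction drawn above can be checked mechanically from a table of cell counts. The following is a minimal sketch in Python, assuming the counts nij are supplied as a list of rows; the function name `classify_layout` and the wording of the returned labels are illustrative choices, not terminology from this study.

```python
def classify_layout(n_counts):
    """Classify a two-way table of cell counts n_ij.

    n_counts[i][j] is the number of observations in subclass (i, j).
    Returns one of three labels (illustrative wording):
      'balanced'                      - all subclasses have the same positive count
      'unbalanced (all cells filled)' - counts differ, but no subclass is empty
      'unbalanced (some cells empty)' - at least one subclass has no observations
    """
    cells = [n for row in n_counts for n in row]
    if any(n == 0 for n in cells):
        return "unbalanced (some cells empty)"
    if len(set(cells)) == 1:
        return "balanced"
    return "unbalanced (all cells filled)"


print(classify_layout([[3, 3], [3, 3]]))
print(classify_layout([[3, 2], [4, 3]]))
print(classify_layout([[3, 0], [4, 3]]))
```

The three calls correspond, in order, to a balanced layout, an unbalanced layout with every subclass filled, and an unbalanced layout with an empty subclass.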

In many areas of research, it is necessary to analyze the variance of data that are classified in two ways, with unequal numbers of observations falling into the cells of the classification. Data of this kind require special methods of analysis because of the inequality of the cell numbers. This research work attempts to provide such a method.

1.2 PROBLEM INVOLVED IN RANDOM MODELS

The problem associated with random-effect models has been the determination of an approximate F-test for the main effects, say A and B, using the F-ratio. In this case there would be no obvious denominator for testing the hypothesis Ho: σ² = 0 for a level of Factor A crossed with a level of Factor B in a model such as

Xijk = μ + αi + βj + λij + εijk,

where Xijk is the kth observation (k = 1, 2, …, nij) in the ith level of Factor A and the jth level of Factor B, with i = 1, 2, …, p and j = 1, 2, …, q; µ is the general mean; αi is the effect due to the ith level of Factor A; βj is the effect due to the jth level of Factor B; λij is the interaction between the ith level of Factor A and the jth level of Factor B; and εijk is the observation error associated with Xijk.
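
To make the notation of this model concrete, the sketch below draws observations Xijk for a given table of cell counts nij. It is only an illustration under stated assumptions: all random effects and errors are taken to be independent normal variables with mean zero, and the function name `simulate_two_way` and its parameters (`sd_a`, `sd_b`, `sd_ab`, `sd_e`) are hypothetical. None of these choices comes from the text above.

```python
import random


def simulate_two_way(n_counts, mu, sd_a, sd_b, sd_ab, sd_e, seed=0):
    """Draw X_ijk = mu + alpha_i + beta_j + lambda_ij + eps_ijk.

    n_counts[i][j] is the cell count n_ij (it may be zero for an empty cell).
    Assumption of this sketch: every effect is an independent N(0, sd^2) draw.
    Returns a dict mapping (i, j) to the list of observations in that cell.
    """
    rng = random.Random(seed)
    p, q = len(n_counts), len(n_counts[0])
    alpha = [rng.gauss(0.0, sd_a) for _ in range(p)]  # Factor A effects
    beta = [rng.gauss(0.0, sd_b) for _ in range(q)]   # Factor B effects
    data = {}
    for i in range(p):
        for j in range(q):
            lam = rng.gauss(0.0, sd_ab)               # interaction effect
            data[(i, j)] = [
                mu + alpha[i] + beta[j] + lam + rng.gauss(0.0, sd_e)
                for _ in range(n_counts[i][j])
            ]
    return data


# Unequal cell counts, including one empty cell (p = 2, q = 3).
cells = simulate_two_way([[2, 0, 3], [1, 4, 2]], mu=10.0,
                         sd_a=1.0, sd_b=1.0, sd_ab=0.5, sd_e=1.0)
print({key: len(values) for key, values in cells.items()})
```

The printed dictionary reproduces the cell counts that were passed in, which shows how nij varies from cell to cell in unbalanced data.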

1.3 PROBLEM OF MIXED EFFECT MODELS

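According to Henderson (1953), in the customary unbalanced random model from a crossed classification, the nature of the expected value of the sum of squares is such that:

\[
E(SSA) = \left(n_{..} - \frac{\sum_i n_{i.}^2}{n_{..}}\right)\sigma^2_A
+ \left(\sum_i \frac{\sum_j n_{ij}^2}{n_{i.}} - \frac{\sum_j n_{.j}^2}{n_{..}}\right)\sigma^2_B
+ \left(\sum_i \frac{\sum_j n_{ij}^2}{n_{i.}} - \frac{\sum_i \sum_j n_{ij}^2}{n_{..}}\right)\sigma^2_{AB}
+ (a - 1)\,\sigma^2_e
\tag{1.1}
\]

where a is the number of levels of Factor A.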

This means that with unbalanced data from crossed classification in

random model, the expected value of every sum of squares contains every

variance component of the model. But in an unbalanced mixed model, with the α’s being fixed rather than random effects, the expectation would then be:

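\[
E(SSA) = \left[\sum_i n_{i.}\alpha_i^2 - \frac{\left(\sum_i n_{i.}\alpha_i\right)^2}{n_{..}}\right]
+ \left(\sum_i \frac{\sum_j n_{ij}^2}{n_{i.}} - \frac{\sum_j n_{.j}^2}{n_{..}}\right)\sigma^2_{\beta}
+ \left(\sum_i \frac{\sum_j n_{ij}^2}{n_{i.}} - \frac{\sum_i \sum_j n_{ij}^2}{n_{..}}\right)\sigma^2_{\beta\alpha}
+ (a - 1)\,\sigma^2_e
\tag{1.2}
\]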

It is observed that the first term of (1.2), which is a function of the fixed effects, differs from the first term of (1.1); this occurs in the expected values of all the sums of squares (except SSE). More importantly, the function of the fixed effects is not the same from one expected sum of squares to the next. For example, with the α’s fixed, E(SSB) contains the term

\[
\sum_j \frac{\left(\sum_i n_{ij}\alpha_i\right)^2}{n_{.j}} - \frac{\left(\sum_i n_{i.}\alpha_i\right)^2}{n_{..}}
\]

which differs from the first term of E(SSA) in (1.2). Thus E[SSA – SSB] does not get rid of the fixed effects, even though it does eliminate the terms in µ. This is true generally in mixed models: the expected values of the sums of squares contain functions of the fixed effects that cannot be eliminated by taking linear combinations of the sums of squares. The equations \(E(SS) = P\sigma^2 + \sigma_e^2 f\) of the random model therefore take the form \(E(SS) = P\sigma^2 + \sigma_e^2 f + q\) in the mixed model, where q is a vector of quadratic functions of the fixed effects in the model. Hence σ² cannot be estimated, and the analysis of variance method applied to unbalanced data yields biased estimators of the variance components in mixed as well as fixed-effect models.

Mixed models involve dual estimation problems – estimating both

fixed effects and variance components.
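
As a numerical companion to (1.1) and (1.2), the sketch below evaluates, for a given table of cell counts nij, the coefficients of the variance components in E(SSA) and the two quadratic functions of the fixed effects that appear in E(SSA) and E(SSB). It is a minimal illustration rather than part of the Intra-Factor procedure. The function names `essa_coefficients` and `fixed_effect_terms` are hypothetical, exact fractions are used only to avoid rounding, and every row and column total is assumed to be positive.

```python
from fractions import Fraction


def _margins(n):
    """Return row totals n_i., column totals n_.j and the grand total n_.."""
    rows = [sum(r) for r in n]
    cols = [sum(c) for c in zip(*n)]
    return rows, cols, sum(rows)


def essa_coefficients(n):
    """Coefficients of (sigma2_A, sigma2_B, sigma2_AB, sigma2_e) in E(SSA), eq. (1.1)."""
    rows, cols, total = _margins(n)
    # sum_i (sum_j n_ij^2) / n_i.
    within = sum(Fraction(x * x, r) for row, r in zip(n, rows) for x in row)
    c_a = total - Fraction(sum(r * r for r in rows), total)
    c_b = within - Fraction(sum(c * c for c in cols), total)
    c_ab = within - Fraction(sum(x * x for row in n for x in row), total)
    return c_a, c_b, c_ab, Fraction(len(n) - 1)


def fixed_effect_terms(n, alpha):
    """Quadratic functions of fixed alpha_i in E(SSA) of (1.2) and in E(SSB)."""
    rows, cols, total = _margins(n)
    alpha = [Fraction(x) for x in alpha]
    # (sum_i n_i. alpha_i)^2 / n_..
    correction = sum(r * x for r, x in zip(rows, alpha)) ** 2 / total
    q_a = sum(r * x * x for r, x in zip(rows, alpha)) - correction
    q_b = sum(
        sum(n_ij * x for n_ij, x in zip(col, alpha)) ** 2 / c
        for col, c in zip(zip(*n), cols)
    ) - correction
    return q_a, q_b


balanced = [[2, 2, 2], [2, 2, 2]]
unbalanced = [[1, 2], [3, 2]]
print(essa_coefficients(balanced), essa_coefficients(unbalanced))
print(fixed_effect_terms(balanced, [1, -1]), fixed_effect_terms(unbalanced, [1, -1]))
```

With the balanced counts, the coefficient of σ²B in E(SSA) is zero and the fixed-effect term in E(SSB) vanishes. With the unbalanced counts [[1, 2], [3, 2]], the coefficient of σ²B is 4/15, and the fixed-effect terms in E(SSA) and E(SSB) are 15/2 and 1/2 respectively. This illustrates why neither the variance components nor the fixed effects separate out of the expected sums of squares.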

1.4 AIM / OBJECTIVE OF THE STUDY

The aim of this work is to analyze the unbalanced fixed-effect non-interactive model using the Intra-Factor Design.

1.5 SIGNIFICANCE OF THE STUDY

Because of the dual estimation problem of the mixed model with unbalanced data, which accounts for the biased estimators of variance components, Henderson (1953) designed a method to correct this deficiency. His Method 2 uses the data first to estimate the fixed effects of the model and then uses these estimators to adjust the data. Variance components are then estimated from the adjusted data by the analysis of variance method. The whole procedure was designed so that the resulting variance component estimators are not biased by the presence of the fixed effects in the model, as the analysis of variance estimators derived from the basic data are. So far as the criterion of unbiasedness is concerned, this was certainly achieved by the method.

But the general method of analyzing data adjusted according to some estimator of the fixed effects is open to criticism on other grounds: it cannot be uniquely defined, and a simplified form of it, of which Henderson’s Method 2 is a special case, cannot be used whenever the model includes interactions between the fixed effects and the random effects. Hence the need for this research, “Analysis of Unbalanced Fixed-Effect Non-interactive Model”.

1.6 SCOPE / LIMITATION OF THE STUDY

This study is limited to fixed-effect non-interactive models.

1.7 ORGANIZATION OF THE STUDY

The work is organized into five chapters. Chapter one is the introduction to the study. Topics considered in this chapter include the problem involved in random models, the problem of mixed effect models, the aim and objective of the study, the significance of the study, the scope and limitations of the study, and the organization of the study.

In chapter two, the related literature is reviewed to see what various researchers have written about the topic under consideration. The following subtopics are considered: the test for the interaction effect and the elimination of the interaction effect.

In chapter three, the methodology used in the study is described systematically to aid the understanding of the study. The method of data analysis is also derived mathematically.

Chapter four presents the data and their analysis. The following subtopics are considered: the least squares method of analyzing the unbalanced factor design, the test of hypothesis, and an illustrative example of the analysis of unbalanced data using the Intra-Factor Design.

Finally, chapter five, the concluding chapter, contains the discussion of the main findings, the conclusion and the recommendations.