The motivation for this project comes from “Sully”, a movie that depicts
an aircraft emergency landing on the Hudson River after a collision with
Canada geese. With growing awareness of climate change and species
adaptation, we also picked birds as our subject of study because we
learned in our Mass Extinction module that the extinction rate for birds
is unusually high compared with other species. Furthermore, we think
that birdstrikes, migration and flight patterns deserve more scholarly
research. While we attempt to make sense of these phenomena in this
project, we draw upon separate sources for inspiration.
We would like to reproduce a similar animation using the same eBird
database, but on a different temporal scale, to show the migration
patterns of the 10 bird species most frequently struck by planes
according to the U.S. FAA database.
Birdstrikes:
Here is the birdstrike data obtained from the Federal Aviation
Administration.
## # A tibble: 6 x 24
## X1 INCIDENT_DATE INCIDENT_MONTH INCIDENT_YEAR TIME TIME_OF_DAY
## <dbl> <chr> <dbl> <dbl> <tim> <chr>
## 1 1 6/22/2010 0:~ 6 2010 23:45 Night
## 2 2 6/6/2016 0:00 6 2016 NA Day
## 3 3 6/18/2005 0:~ 6 2005 10:37 Day
## 4 4 3/25/2015 0:~ 3 2015 06:05 Night
## 5 5 10/11/1996 0~ 10 1996 NA Day
## 6 6 3/25/2009 0:~ 3 2009 12:15 Day
## # ... with 18 more variables: AIRPORT_ID <chr>, AIRPORT <chr>,
## # STATE <chr>, FAAREGION <chr>, LOCATION <chr>, DISTANCE <dbl>,
## # OPID <chr>, OPERATOR <chr>, AIRCRAFT <chr>, PHASE_OF_FLIGHT <chr>,
## # HEIGHT <dbl>, SPEED <dbl>, SKY <chr>, PRECIPITATION <chr>,
## # EFFECT <chr>, SPECIES <chr>, REMARKS <chr>, total_strikes <dbl>
It is worth noting that the birdstrike data is inconsistent: it contains
records for 1944 and for 1982 to 2018, and the 2019 data is incomplete
because we downloaded the data in November 2019. For our analysis, we
split the strike data into 3 dataframes: the first covers the 29 years
from 1990 to 2018 for the overall trend, the second covers the 10 years
from 1990 to 1999 (labelled 1990-2000 in the plots), and the third
covers 2000 to 2018.
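Because the two sub-periods differ in length (the filters in the code keep 1990 to 1999 and 2000 to 2018), it can help to check how many calendar years each window covers before comparing totals. The sketch below uses only base R; `year_span()` is a hypothetical helper introduced for illustration and is not part of the original analysis.

```r
# Hypothetical helper: number of calendar years in an inclusive range.
year_span <- function(first_year, last_year) {
  length(seq(first_year, last_year))
}

# Windows matching the filters used in this analysis.
periods <- data.frame(
  label = c("overall", "period 1", "period 2"),
  first = c(1990, 1990, 2000),
  last  = c(2018, 1999, 2018)
)
periods$n_years <- mapply(year_span, periods$first, periods$last)
print(periods)
```

Totals summed over windows of unequal length are not directly comparable, so dividing by `n_years` is one way to put the two periods on the same footing.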
We would like to look at the trends of birdstrike data to answer a
fundamental question:
- What are the strike trends overall?
Other helper questions are:
- What months have higher strikes?
- What seasons are those months?
- What is the relationship between birdstrikes, flights, climate change
  and bird migration?
Let’s visualize the bird strike data and see the distribution of strikes
across 12 months in two periods, 1990-2000 and 2000-2018.
#first dataframe inclusive
birdstrikes_30yrs <- birdstrikes %>%
  filter(INCIDENT_YEAR >= 1990 & INCIDENT_YEAR < 2019)
#second dataframe 1st period.
birdstrikes_10yrs_1 <- birdstrikes %>%
  filter(INCIDENT_YEAR >= 1990 & INCIDENT_YEAR < 2000) %>%
  group_by(INCIDENT_MONTH, INCIDENT_YEAR) %>%
  summarize(n = n())
birdstrikes_10yrs_1$label <- "1990-2000"
#third dataframe 2nd period.
birdstrikes_10yrs_2 <- birdstrikes %>%
  filter(INCIDENT_YEAR >= 2000 & INCIDENT_YEAR < 2019) %>%
  group_by(INCIDENT_MONTH, INCIDENT_YEAR) %>%
  summarize(n = n())
birdstrikes_10yrs_2$label <- "2000-2018"
#creating boxplot to observe the differences in 2 periods
birds_2periods <- rbind(birdstrikes_10yrs_1, birdstrikes_10yrs_2)
birds_2periods %>%
  plot_ly(x = ~INCIDENT_MONTH, y = ~n, color = ~label, colors = "Paired") %>%
  add_boxplot() %>%
  layout(
    title = "Monthly Summary Statistics for Birdstrikes in Two Periods",
    xaxis = list(
      title = "Month",
      ticktext = list("Jan", "Feb", "Mar", "Apr", "May", "June",
                      "July", "Aug", "Sep", "Oct", "Nov", "Dec"),
      tickvals = list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
    ),
    yaxis = list(title = "Total Strikes")
  )
Before starting our analysis, we assume that a more abundant presence of
birds is highly correlated with a higher number of collisions with
planes. Under this assumption, a high collision rate for a bird species
indicates that the species is actively migrating.
According to the NOAA migration study referenced earlier:
“Bird species that head out over the Atlantic Ocean during autumn
migration to spend winter in the Caribbean and South America follow a
clockwise looped trajectory and take a path farther inland on their
return journey in the spring.”
We would therefore expect high migration rates, and with them high
birdstrike counts, in the fall months (September - December) and in
spring (March - June). The data reflects these migration observations;
however, we also see the highest strike counts in the summer months.
The graph shows that, of the two periods, the more recent one not only
has markedly more bird strikes but also shows an anomaly in May compared
to the first period. The higher strike counts in May could indicate that
bird activity is adapting to an earlier spring, a season shift effect
due to climate change. We suspect that because fall lasts longer and
winter arrives later, migration activity extends well into November and
December, and bird activity is spread across more months of the year.
Perhaps it would be easier to distinguish the differences between the
two periods with a seasonal graph. Let’s bin our months into seasons and
check what we hypothesized above.
#Binning months into respective seasons
bin_season <- function(x) {
  case_when(
    x == 3 | x == 4 | x == 5 ~ "Spring",
    x == 6 | x == 7 | x == 8 ~ "Summer",
    x == 9 | x == 10 | x == 11 ~ "Fall",
    x == 12 | x == 1 | x == 2 ~ "Winter"
  )
}
totaling_period <- function(period) {
  period %>%
    group_by(INCIDENT_MONTH, Seasons, label) %>%
    summarize(means_month = sum(n))
}
birdstrikes_10yrs_1$Seasons <- bin_season(birdstrikes_10yrs_1$INCIDENT_MONTH)
birdstrikes_10yrs_2$Seasons <- bin_season(birdstrikes_10yrs_2$INCIDENT_MONTH)
period1 <- totaling_period(birdstrikes_10yrs_1)
period2 <- totaling_period(birdstrikes_10yrs_2)
twoperiods <- rbind(period1, period2)
season <- twoperiods %>%
  ggplot(aes(INCIDENT_MONTH, means_month, colour = Seasons)) +
  geom_point(alpha = 1/6) +
  geom_line() +
  scale_x_discrete(
    name = "",
    limits = c("Jan", "Feb", "Mar", "Apr", "May", "June",
               "July", "Aug", "Sep", "Oct", "Nov", "Dec")
  ) +
  labs(title = "Seasonal Migration and BirdStrikes",
       y = "Total number of strikes", x = " ") +
  theme_minimal() +
  facet_wrap(~label) +
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(colour = "grey20", size = 8, angle = 90,
                               hjust = 0.5, vjust = 0.5),
    axis.text.y = element_text(colour = "grey20", angle = 90, size = 8),
    text = element_text(size = 10)
  )
ggplotly(season)
We see here that, aside from the overall increase in birdstrikes in the
second period (2000-2018), the usual migration seasons, fall and spring,
reflect the increase in bird-plane collisions. Summer shows a much
higher total, because some Western Hemisphere bird species also migrate
in the summer, a movement known as molt migration or postbreeding
dispersal. This is true for both Red-tailed Hawks and Gulls, which we
will see later in our migration analysis.
Flights - Factor 1:
A news article from USA Today reported that several factors contributed
to this increase in plane-bird collisions, including “an increase in
flights; changing migratory patterns, etc…”
This is our flight data from 1990 to 2018 obtained from the Department
of Transportation for all nonstop commercial passenger traffic traveling
between international points and U.S. airports.
#Reading the international dataset
planes <- read.csv("dataset/International_Report_Passengers.csv")
#Filtering for years between 1990 and 2018
planes <- planes %>% filter(Year < 2019 & Year > 1989)
head(planes)
Let’s aggregate the birdstrike rates for the 29 years from 1990 to 2018,
excluding 2019 because the year is incomplete. The number of strikes has
increased from approximately 6 strikes/day in 1990 to 46 strikes/day in
2018, assuming 360 days in a year.
#Calculate strike rate for birds
strikerate <- birdstrikes_30yrs %>%
  group_by(INCIDENT_YEAR) %>%
  count() %>%
  mutate(rates = n / 360) %>%
  rename(Year = INCIDENT_YEAR)
strikerate$Year <- as.integer(strikerate$Year)
strikerate$label <- "Bird Strikes"
#Calculate flight rate for planes
flightrate <- planes %>%
  count(Year) %>%
  mutate(rates = n / 360)
flightrate$label <- "Flights"
#Lets visualize the rates of flight and strikes
bird_flight <- dplyr::bind_rows(strikerate, flightrate)
ani_strikes <- bird_flight %>%
  dplyr::select(Year, rates, label) %>%
  ggplot(aes(x = Year, y = rates, group = label, color = label)) +
  theme_minimal() +
  labs(x = 'Year', y = "Rates Per Day") +
  theme(text = element_text(size = 16)) +
  ggtitle("Rates of Strikes and Flights from 1990 - 2018") +
  geom_line() +
  geom_point() +
  transition_reveal(Year)
#Make animation and saved as gif
#strike_ani <- gganimate::animate(ani_strikes, 100, 20)
#anim_save(strike_ani)
Both rates trend upward over the period. This is plausible because the
birdstrike data is distributed by the FAA, which records each plane-bird
collision, so more flights create more opportunities for a recorded
strike. We therefore take this as support for our first factor: increase
in flights -> increase in strikes, holding other factors constant.
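One way to separate the effect of traffic volume from the raw counts is to express strikes relative to flights. The sketch below is illustrative only: the counts are made-up placeholder values, not FAA or DOT figures, and `daily_rate()` and `strikes_per_10k()` are hypothetical helpers. It also shows how the assumed year length (360 versus 365 days) changes a per-day rate.

```r
# Hypothetical helper: average events per day for a yearly total.
daily_rate <- function(yearly_total, days_in_year = 365) {
  yearly_total / days_in_year
}

# Hypothetical helper: strikes per 10,000 flights.
strikes_per_10k <- function(strikes, flights) {
  strikes / flights * 10000
}

# Placeholder values for illustration only (NOT real FAA/DOT data).
toy <- data.frame(
  Year    = c(1990, 2018),
  strikes = c(2000, 16000),
  flights = c(5e6, 1e7)
)

toy$rate_360 <- daily_rate(toy$strikes, days_in_year = 360)
toy$rate_365 <- daily_rate(toy$strikes)
toy$per_10k  <- strikes_per_10k(toy$strikes, toy$flights)
print(toy)
```

If the normalised rate still rises over time, traffic growth alone would not account for the increase in strikes.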
Climate - Temperature - Factor 2:
With support for our assumption that more birdstrikes happen during bird
migration seasons, let’s look at how temperature can affect bird
activity.
We are interested in whether hotter months see more birdstrikes, because
we suspect that a hotter climate means harder migrations. As stated by a
PhD student in Biological and Biomedical Sciences at Harvard Medical
School, the long migration journeys from North to South America in the
fall and from South to North America in the spring become incredibly
dangerous for the birds. They might be leaving North America later in
the fall, around the holiday season, when the average American often
books a flight to celebrate with family. These fall-migrating birds
therefore have a higher chance of being struck when more planes are in
the air. Additionally, as temperatures rise, bird species that rely on
the seasons for food and breeding may arrive earlier in spring,
consistent with our earlier observation that more birdstrikes were seen
in May in the 2000 to 2018 period than in the 1990 - 2000 period. In a
published paper using the same eBird dataset, Hurlbert and Liang confirm
that arrival in N.A. advances by 0.8 days for every degree Celsius of
warming.
Let us take a look at the mean monthly temperatures and strikes for 1990
and 2012 to observe the changes. We chose two years far apart to make
the change more apparent and to reduce analysis runtime. The data comes
in a horizontal (wide) format with latitudes and longitudes, which we
will use for our maps later in this analysis, so we transform it into a
vertical (long) format to compare with our birdstrikes data. We do so by
transposing the dataframe and finding the mean temperature for each
month of the year for the U.S. Notice that we are not averaging
temperature over the entire American continent, even though our later
migration analysis uses N.A. as its scope. This is because birds migrate
across the continent for breeding purposes tied directly to the opposite
seasonality of North and South America. If we took the average
temperature for the continent as a whole, we would not be able to see
the actual temperature changes, since they would be cancelled out by the
opposite seasonality.
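The cancellation argument can be illustrated with a toy example. The two temperature series below are synthetic cosine curves, not real measurements: one site peaks in July and the other has the opposite seasonal cycle. `seasonal_range()` is a hypothetical helper introduced for this sketch.

```r
# Toy monthly temperatures in Celsius (NOT real data): a northern site
# peaking in July and a southern site with the opposite seasonal cycle.
months <- 1:12
north <- 10 + 12 * cos(2 * pi * (months - 7) / 12)
south <- 10 - 12 * cos(2 * pi * (months - 7) / 12)

# Averaging the two sites month by month removes the seasonal signal.
combined <- (north + south) / 2

# Hypothetical helper: size of the seasonal swing in a series.
seasonal_range <- function(x) max(x) - min(x)

seasonal_range(north)     # large swing at a single site
seasonal_range(combined)  # essentially zero once both are averaged
```

Restricting the average to one region keeps the seasonal swing visible, which is what the month-by-month comparison with strike counts needs.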
#Function to read and interpret temperature data
rename_temp <- function(dataset) {
  newset <- dataset %>%
    rename(deciLongi = V1, deciLatit = V2,
           "1" = V3, "2" = V4, "3" = V5, "4" = V6, "5" = V7, "6" = V8,
           "7" = V9, "8" = V10, "9" = V11, "10" = V12, "11" = V13,
           "12" = V14)
  return(newset)
}
temp1990 <- rename_temp(read.table("dataset/air_temp1990.txt")[1:14])
temp2012 <- rename_temp(read.table("dataset/air_temp2012.txt")[1:14])
head(temp1990)
#function that reformats the temperatures, filtered to a bounding box around the U.S.
my_transpose <- function(data) {
  trans <- data %>%
    filter(deciLongi >= -125.0011, deciLongi <= -66.9326,
           deciLatit >= 24.9493, deciLatit <= 50.0704) %>%
    subset(select = -c(deciLongi, deciLatit)) %>%
    lapply(mean) %>%
    as.data.frame(long = TRUE) %>%
    gather() %>%
    rename(Month = key, Celcius = value)
  return(trans)
}
#transposing into vertical format
summary_temp90 <- my_transpose(temp1990)
summary_temp12 <- my_transpose(temp2012)
#Monthly strikes, with temperature converted from Celsius to Fahrenheit
temp_strikes_12 <- birdstrikes_10yrs_2 %>%
  filter(INCIDENT_YEAR == 2012) %>%
  cbind(ave_temp = round(((summary_temp12$Celcius * 9/5) + 32), 3))
temp_strikes_19 <- birdstrikes_10yrs_1 %>%
  filter(INCIDENT_YEAR == 1990) %>%
  cbind(ave_temp = round(((summary_temp90$Celcius * 9/5) + 32), 3))
head(temp_strikes_19)
Now that we have our temperatures formatted and converted into
Fahrenheit, let’s see what relationship exists between strikes and
temperature in the U.S. for the years 1990 and 2012.
#create axis layouts, explicitly
R_Axis <- list(side = "right", overlaying = "y",
               title = "Count of Birdstrikes",
               showgrid = FALSE, zeroline = FALSE)
L_Axis <- list(side = "left", title = "Average Monthly Temperature",
               showgrid = FALSE, zeroline = FALSE)
one_plot <- function(data) {
  data %>%
    plot_ly(x = data$INCIDENT_MONTH) %>%
    add_trace(y = ~n, name = 'Count of birdstrikes', type = 'scatter',
              mode = "markers", yaxis = 'y2', line = list(color = "red"),
              hoverinfo = "text", text = ~paste(n, "Strikes")) %>%
    add_trace(y = ~ave_temp, type = 'bar', name = "Temperature",
              yaxis = 'y', marker = list(color = 'lightblue'),
              hoverinfo = "text", text = ~paste(ave_temp, 'F')) %>%
    layout(title = "Bird Presence and Temperature(F) for 1990 and 2012",
           xaxis = list(title = "Month"),
           yaxis2 = R_Axis, yaxis = L_Axis)
}
p1 <- one_plot(temp_strikes_19)
p1 %>% layout(title = "Bird Presence and Temperature(F) for 1990")
## A line object has been specified, but lines is not in the mode
## Adding lines to the mode...
#There is a bug in plotly subplot, for this reason we will view them separately.
p2 <- one_plot(temp_strikes_12)
p2 %>% layout(title = "Bird Presence and Temperature(F) for 2012")
## A line object has been specified, but lines is not in the mode
## Adding lines to the mode...
Our two plots suggest that increasing temperature is not associated with
increasing birdstrikes. The higher birdstrike counts in 2012 occur from
Aug to Nov. This supports our suspicion that more birds are leaving
later in the fall, and that a higher number of planes in the air during
the holiday season can mean a higher probability of collisions. We
cannot, however, confirm the other hypothesis, that higher temperatures
lead to changes in bird migration patterns, as our referenced paper did.
Again, these are not direct correlations, and we cannot be certain that
flights or temperature alone cause birdstrikes to increase. This is why
we are going to look at bird migration as our next factor.
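One way to quantify the association discussed here is a correlation coefficient between monthly temperature and monthly strike counts. The sketch below uses made-up placeholder values rather than the FAA or temperature datasets, so the resulting coefficients carry no empirical meaning; it only shows the mechanics with base R's `cor()`.

```r
# Placeholder monthly values for illustration only (NOT real data).
toy <- data.frame(
  month    = 1:12,
  ave_temp = c(31, 35, 44, 54, 63, 72, 77, 75, 67, 56, 44, 34),
  strikes  = c(40, 38, 60, 85, 120, 100, 130, 170, 180, 175, 110, 55)
)

# Pearson correlation measures linear association.
pearson <- cor(toy$ave_temp, toy$strikes, method = "pearson")

# Spearman correlation is rank-based and less sensitive to outliers.
spearman <- cor(toy$ave_temp, toy$strikes, method = "spearman")

round(c(pearson = pearson, spearman = spearman), 2)
```

A coefficient near zero would not rule out a lagged or non-linear relationship, and a high one would not by itself establish that temperature causes strikes.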
Bird Migration - Factor 3:
Here are the NOAA Western Hemisphere bird migration patterns: