Azure Real World: Migrating Drupal from LAMP to Windows Azure

Last month, the Interoperability team at Microsoft described the work done to move the Screen Actors Guild Awards Drupal website from a Linux-Apache-MySQL-PHP (LAMP) environment to the Windows Azure platform: "SAG Awards Drupal Website Moves to Windows Azure". The move was the result of a collaboration between SAG Awards engineers and engineers from Microsoft's Interoperability team and Customer Advisory Team (CAT). It allowed the SAG Awards website to handle a sustained spike in traffic during the SAG Awards show in January. Since then, I have had the chance to talk with some of the engineers who helped with the move. In this article, I describe the challenges and the steps taken in moving the SAG Awards website from a LAMP environment to the Windows Azure platform.

Background

The Screen Actors Guild (SAG) is the largest labor union in the United States representing working actors. Every January since 1995, SAG has hosted the Screen Actors Guild Awards (SAG Awards) to honor performers in motion pictures and TV series. In 2011, the SAG Awards Drupal website, deployed on a LAMP stack, suffered outages and slow performance on peak-usage days, and SAG had to keep upgrading its hardware to meet demand on those days. That upgraded hardware was then underused for the rest of the year. In late 2011, SAG Awards engineers began working with Microsoft engineers to move the website to Windows Azure ahead of the 2012 show. In January 2012, the SAG website had more than 350,000 unique visitors and 1.1 million page views, and traffic during the show reached more than 160,000 visitors.

Overview and Challenges

In many ways, the SAG Awards website was a perfect candidate for Windows Azure. The website sees moderate traffic for most of the year, but it has a sustained spike in traffic shortly before, during, and after the awards show in January. The elastic scalability and fast storage offered by the Azure platform were designed for this type of usage.

The main challenge that SAG Awards and Microsoft engineers faced in moving the SAG Awards website to Windows Azure was handling a very high, sustained traffic spike while still letting SAG Awards administrators update media files frequently during the awards show. Both intelligent use of Windows Azure Blob Storage and a custom module for invalidating cached pages when content was updated were key to delivering a positive user experience.

Note: In this article I focus on how the Drupal website was moved to Windows Azure, and on how content and data were moved to Windows Azure Blob Storage and SQL Azure. I do not cover the details of the caching strategy.

The process of moving the SAG Awards website from a LAMP environment to the Windows Azure platform can be broken down into five high-level steps:

  1. Export data. A custom Drush command (portabledb-export) was used to create a dump of the MySQL database. A .zip archive of the media files was created for later use.
  2. Install Drupal on Windows. The Drupal files that made up the installation in the LAMP environment were copied to Windows Server/IIS as a first step in discovering compatibility issues.
  3. Import data to SQL Azure. A custom Drush command (portabledb-import) was used with the database dump created in step 1 to import data into SQL Azure.
  4. Copy media files to Azure Blob Storage. After the .zip archive from step 1 was unpacked, CloudXplorer was used to copy the files to Windows Azure Blob Storage.
  5. Package and deploy Drupal. The Azure packaging tool cspack was used to package Drupal for deployment. Deployment was done through the Windows Azure portal.

Note: The portabledb commands mentioned above were created and are maintained by Damien Tournoud.

Details for each of these high-level steps are in the sections below.

Export Data

Microsoft and SAG engineers began investigating the best way to export MySQL data by looking at Damien Tournoud’s portabledb Drush commands. They found that this tool worked perfectly when moving Drupal to Windows and SQL Server, but they needed to make some modifications to the tool for exporting data to SQL Azure. (These modifications have since been incorporated into the portabledb commands, which are now available as part of the Windows Azure Integration Module.)

The names of media files stored in the file_managed table were of the form public://field/image/file_name.avi. In order for these files to be streamed from Windows Azure Blob Storage (as they would be by the Windows Azure Integration module when deployed in Azure), the file names needed to be modified to this form: azurepublic://field/image/file_name.avi. This was an easy change to make.
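The renaming rule described above can be sketched in a few lines of Python. This is only an illustration of the scheme swap, not the code used in the portabledb tool; the function name and the sample URIs are hypothetical.

```python
OLD_SCHEME = "public://"        # scheme used by the LAMP installation
NEW_SCHEME = "azurepublic://"   # stream wrapper name used on Windows Azure


def to_azure_uri(uri: str) -> str:
    """Swap the leading public:// scheme; leave any other URI untouched."""
    if uri.startswith(OLD_SCHEME):
        return NEW_SCHEME + uri[len(OLD_SCHEME):]
    return uri


# Illustrative values only; real rows would come from the file_managed table.
uris = ["public://field/image/file_name.avi", "temporary://upload.tmp"]
converted = [to_azure_uri(u) for u in uris]
print(converted)
```

Only the scheme prefix is replaced, so the path after it is left untouched and files stored under other schemes keep their original URIs.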

Because the SAG Awards website would be retrieving all data from the cloud, Windows Azure Storage connection information needed to be stored in the database. The portabledb tool was modified to create a new table, azure_storage, for containing this information.

Finally, to allow all media files to be retrieved from Blob Storage, the file_default_scheme variable (stored in Drupal's variable table) needed to be set to the stream wrapper name: azurepublic.

Using the modified portabledb tool, the following command produced the database dump:

drush portabledb-export --use-windows-azure-storage=true --windows-azure-stream-wrapper-name=azurepublic --windows-azure-storage-account-name=azure_storage_account_name --windows-azure-storage-account-key=azure_storage_account_key --windows-azure-blob-container-name=azure_blob_container_name --windows-azure-module-path=sites/all/modules --ctools-module-path=sites/all/modules > drupal.dump

Note that the portabledb-export command does not copy media files themselves. Instead, the local media files were compressed in a .zip archive for use in a later step.
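For a scripted export, the same invocation could be assembled programmatically. The Python sketch below builds the command string from the option names shown in the export command above, writing each option with a double hyphen. The helper name and its parameters are hypothetical, and the quoting assumes a POSIX shell, as on the LAMP host.

```python
import shlex


def export_command(account_name, account_key, container, dump_path="drupal.dump"):
    """Return the portabledb-export command line as a single shell string."""
    options = {
        "use-windows-azure-storage": "true",
        "windows-azure-stream-wrapper-name": "azurepublic",
        "windows-azure-storage-account-name": account_name,
        "windows-azure-storage-account-key": account_key,
        "windows-azure-blob-container-name": container,
        "windows-azure-module-path": "sites/all/modules",
        "ctools-module-path": "sites/all/modules",
    }
    args = ["drush", "portabledb-export"]
    args += [f"--{name}={value}" for name, value in options.items()]
    # Quote each argument so keys with special characters survive the shell.
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return f"{quoted} > {shlex.quote(dump_path)}"


print(export_command("azure_storage_account_name",
                     "azure_storage_account_key",
                     "azure_blob_container_name"))
```

Building the string from a dictionary keeps each option on its own line, which makes a mistyped or missing option easier to spot than in a single long command.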

Install Drupal on Windows

In order to use the portabledb-import command (the counterpart to the portabledb-export command above), a Drupal installation needed to be set up on Windows (with Drush for Windows installed). This was necessary, in part, because connectivity to SQL Azure was to be managed by the Commerce Guys’ SQL Server/SQL Azure module for Drupal, which relies on the SQL Server Drivers for PHP, a Windows-only PHP extension. Having a Windows installation of Drupal would also make it possible to package the application for deployment to Windows Azure. For this reason, Microsoft and SAG Awards engineers copied the Drupal files from the LAMP environment to a Windows Server machine. The team incrementally moved the rest of the application to an IIS/SQL Server Express stack before moving the backend to SQL Azure.

Note: The Windows Server machine was actually a virtual machine running in a Windows Azure Web Role in the same data center as SQL Azure. The Web Role was configured to allow RDP connections, which the team used to install and configure the SAG website installation. This was done to avoid timeouts that occurred when attempting to upload data from an on-premises machine to SQL Azure.

There were, however, some customizations made to the Drupal installation before running the portabledb-import command.

Some customizations to PHP were also necessary since this PHP installation would be packaged with the application itself:

  • The php_pdo_sqlsrv.dll extension was installed and enabled. This extension provided connectivity to SQL Azure.
  • The php_memcache.dll extension was installed and enabled. This would be used for caching purposes.
  • The php_azure.dll extension was installed and enabled. This extension allowed configuration information to be retrieved from the Windows Azure service configuration file after the application was deployed. This allowed changes to be made without having to re-package and re-deploy the entire application. For example, database connection information could be retrieved in the settings.php file like this:
$databases['default']['default']['driver'] = 'sqlsrv';
$databases['default']['default']['username'] = azure_getconfig('sql_azure_username');
$databases['default']['default']['password'] = azure_getconfig('sql_azure_password');
$databases['default']['default']['host'] = azure_getconfig('sql_azure_host');
$databases['default']['default']['database'] = azure_getconfig('sql_azure_database');

With Drupal running on Windows, and with the customizations to Drupal and PHP outlined above, the importing of data could begin.

Import Data to SQL Azure

There were two phases to importing the SAG Awards website data: importing database data to SQL Azure and copying media files to Windows Azure Blob Storage. As alluded to above, importing data to SQL Azure was done with the portabledb-import Drush command. With SQL Azure connection information specified in Drupal’s settings.php file, the following command copied data from the drupal.dump file (which was copied to Drupal’s root directory on the Windows installation) to SQL Azure:

drush portabledb-import --delete-local-files=false --copy-files-blob-storage=false --use-production-storage=true drupal.dump

Note: The copy-files-blob-storage flag was set to false in the command above. While the portabledb-import command can copy media files to Blob Storage, Microsoft and SAG engineers had some work to do in modifying media file names (discussed in the next section). For this reason, they chose not to use this tool for uploading files to Blob Storage.

The next step was to create stored procedures on SQL Azure that are designed to handle some SQL that is specific to MySQL. The SQL Server/SQL Azure module for Drupal normally creates these stored procedures when the module is enabled, but since Drupal would be deployed with the module already enabled, these stored procedures needed to be created manually. Engineers executed the stored procedure creation DDL that is defined in the module by accessing SQL Azure through the management portal.

After the import was complete, the Windows installation of the SAG Awards website was now retrieving all database data from SQL Azure. However, recall that the portabledb-export command modified the names of media files in the file_managed table so that the Drupal Azure module would retrieve media files from Blob Storage. The final phase in importing data was to copy media files to Blob Storage.

Note: After this phase was complete, engineers cleared the cache through the Drupal admin panel.

Copy Media Files to Blob Storage

The main challenge in copying media files to Windows Azure Blob Storage was in handling Linux file name conventions that are not supported on Windows. While Linux supports a colon (:) as part of a file name, Windows does not. Consequently, when the .zip archive of media files was unpacked on Windows, file names were automatically changed: all colons were converted to underscores (_). However, colons are supported in Blob Storage as part of blob names. This meant that files could be uploaded to Blob Storage from Windows with underscores in the blob names, but the blob names would have to be modified manually to match the names stored in SQL Azure.

Engineers used WinRAR to unpack the .zip archive of media files. WinRAR provided a record of all file names that were changed in the unpacking process. Engineers then used CloudXplorer to upload the media files to Blob Storage and to change the modified file names, replacing underscores with colons.
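The engineers did this renaming by hand with CloudXplorer. As an illustration of the underlying mapping only, the Python sketch below derives the rename list from the names stored in the database rather than by blindly replacing every underscore, since many file names contain underscores of their own. The function names and sample file names are hypothetical.

```python
def windows_safe(name: str) -> str:
    """Mimic the unpacking step on Windows: colons become underscores."""
    return name.replace(":", "_")


def build_rename_map(db_names):
    """Map each uploaded (underscore) blob name to the name stored in the database."""
    known = set(db_names)
    renames = {}
    for original in db_names:
        if ":" not in original:
            continue  # name survived unpacking unchanged; nothing to rename
        uploaded = windows_safe(original)
        # Two distinct database names collapsing to one file cannot be
        # resolved automatically, so stop rather than guess.
        if uploaded in known or renames.get(uploaded, original) != original:
            raise ValueError(f"ambiguous upload name: {uploaded}")
        renames[uploaded] = original
    return renames


# Illustrative names only.
db_names = [
    "field/image/red_carpet.jpg",
    "field/image/show:opening_2012.avi",
]
print(build_rename_map(db_names))
```

Starting from the database names means only files that actually contained a colon are renamed, and a collision between two names is reported instead of silently overwriting a blob.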

At this point in the migration process, the SAG Awards website was fully functional on Windows and was retrieving all data (database data and media files) from the cloud.

Package and Deploy Drupal

There were two main challenges in packaging the SAG Awards website for deployment to Windows Azure: packaging a custom installation of PHP and creating the necessary startup tasks.

Because customizations were made to the PHP installation powering Drupal, engineers needed to package their custom PHP installation for deployment to Windows Azure. The other option was to rely on Microsoft’s Web Platform Installer to install a “vanilla” installation of PHP and then write scripts to modify it on startup. Since it is relatively easy to package a custom PHP installation for deployment to Azure, engineers chose to go that route. (For more information, see Packaging a Custom PHP Installation for Windows Azure.)

The startup tasks that needed to be performed were the following:

  • Configure IIS to use the custom PHP installation.
  • Register the Service Runtime COM Wrapper (which played a role in the caching strategy).
  • Put Drush (which was packaged with the deployment) in the PATH environment variable.

The final project structure, prior to packaging, was the following:

\SAGAwards
    \WebRole
    \bin
        \php
        install-php.cmd
        register-service-runtime-COM-Wrapper.cmd
        WinRMConfig.cmd
        startup-tasks-errorlog.txt
        startup-tasks-log.txt
    \resources
        \drush
        \ServiceRuntimeCOMWrapper
        \WebPICmdLine
        (Drupal files and folders)

Finally, the Windows Azure Memcached Plugin was added to the Windows Azure SDK prior to packaging so that memcache would run on startup and restart if killed.

The SAG Awards website was then packaged using cspack and deployed to Windows Azure through the developer portal.

Summary

The work in moving the SAG Awards Drupal website to the Windows Azure platform was an excellent example of Microsoft’s commitment to supporting popular OSS applications on the Windows Azure platform. The collaboration between engineers from SAG and from Microsoft’s Interoperability and Customer Advisory Teams resulted in a win for SAG (the SAG Awards website was able to handle sustained spikes in traffic that it could not handle previously) and in valuable lessons learned for the Windows Azure team about supporting migration and scalability of OSS applications on the Azure Platform.

-Brian